Enhanced Document Class Detection...
Posted: Sat Mar 07, 2009 7:44 am
I would like the document class of a file to be detected based on file content. For example, Unix/Linux/Cygwin can detect different types of script files from the "#!" comment in the first line. Also, most well-formed XML and HTML documents have a DOCTYPE at or near the top of the file.
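To make the request concrete, here is a rough sketch of first-line detection in Python. It is only an illustration under my own assumptions: the function name detect_doc_class, the INTERPRETER_CLASSES table, and the document class names are all made up and are not taken from any editor.

```python
import re

# Hypothetical mapping from interpreter name to document class.
INTERPRETER_CLASSES = {
    "sh": "shell",
    "bash": "shell",
    "perl": "perl",
    "python": "python",
    "python3": "python",
    "ruby": "ruby",
}

# "#!" followed by the program path and an optional first argument.
SHEBANG = re.compile(r"^#!\s*(\S+)(?:\s+(\S+))?")


def detect_doc_class(first_line):
    """Return a document class guessed from the first line, or None."""
    # Drop a leading byte order mark and surrounding whitespace.
    line = first_line.lstrip("\ufeff").strip()

    match = SHEBANG.match(line)
    if match:
        program = match.group(1).rsplit("/", 1)[-1]
        # "#!/usr/bin/env python" names the interpreter in the second field.
        if program == "env" and match.group(2):
            program = match.group(2)
        return INTERPRETER_CLASSES.get(program)

    lowered = line.lower()
    if lowered.startswith("<?xml"):
        return "xml"
    if lowered.startswith("<!doctype html"):
        return "html"
    if lowered.startswith("<!doctype"):
        return "xml"
    return None
```

Something like detect_doc_class("#!/usr/bin/env python") would give "python", and a line that matches nothing gives None so the editor can fall back to whatever it does today.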
Some of the Unix/Linux desktops take this a bit further by supporting methods to define and detect MIME types. The last time I did anything with this was on GNOME, where the files were kept in /usr/share/mime-info and /usr/share/application-registry. (This may have moved because at that time (3+ years ago) the Linux community was really trying to standardize this stuff.)
I don't know how hard it is to implement the code to parse the files that I used. I do know that for a lot of file types, the mime-info and application-registry files were easy to create. Also, if you choose to use the same format, you could use all the existing files. (You could also take advantage of the open source code if you are so inclined.)
I would mostly like this for the zillions of files I edit that have no extension. However, it would also be nice if it could optionally override the extension-based doc-class setting.
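The precedence I have in mind could look something like the sketch below. Again, this is only an assumption about how it might work: choose_doc_class, the content_overrides option, and the EXTENSION_CLASSES table are hypothetical names, and content_class stands for whatever a content-based detector returned (or None if it found nothing).

```python
import os

# Hypothetical extension table; a real editor would have its own settings.
EXTENSION_CLASSES = {".py": "python", ".sh": "shell", ".txt": "text"}


def choose_doc_class(filename, content_class, content_overrides=False):
    """Pick a document class from the extension and a content-based guess."""
    extension = os.path.splitext(filename)[1].lower()
    extension_class = EXTENSION_CLASSES.get(extension)

    # No extension, or one the table does not know: rely on the content.
    if extension_class is None:
        return content_class

    # Optional behavior: let a successful content match win.
    if content_overrides and content_class is not None:
        return content_class

    return extension_class
```

With the option off, a file named notes.txt stays "text" even if its first line looks like a script; with the option on, the content-based guess wins whenever there is one.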