Bug 1306 - Need better detection of when a mimetype is text or binary
Summary: Need better detection of when a mimetype is text or binary
Alias: None
Product: Xamarin Studio
Classification: Desktop
Component: General
Version: Trunk
Hardware: PC Mac OS
Importance: Low normal
Target Milestone: ---
Assignee: Bugzilla
Depends on:
Reported: 2011-10-06 11:02 UTC by Alan McGovern
Modified: 2018-01-22 09:55 UTC

Is this bug a regression?: ---
Last known good build:

Description Alan McGovern 2011-10-06 11:02:01 UTC
Maybe we should have an isText=maybe on mimetypes and run a detection pass on maybes and on files without a known mimetype to see if they are binary or text.

Something like:
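bool DesktopService.FileIsText (string filename) {
    var mimetype = GetMimetype (filename);
    // null means "maybe": the mimetype alone cannot decide.
    bool? isText = MimeTypeIsText (mimetype);
    if (isText.HasValue)
        return isText.Value;
    // Fall back to inspecting the file contents.
    return DetectIsText (filename);
}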
Comment 1 Mike Krüger 2011-11-15 14:37:00 UTC
Why?

You can read files with TextFile.Read ... or something.

You are right that we should do it that way; when I rework the text file stuff I'll do it. The text file implementation currently uses the gtk encoding routines, which is really junk, because .NET has its own encoding routines we can use.
Comment 2 Mikayla Hutchinson [MSFT] 2011-11-16 15:41:41 UTC
This isn't just a problem for unknown filetypes, it's a problem for plists, because they can be text or binary. That's why MimeTypeIsText can't just be boolean, it needs to fall back to detection like git does (binary if null in first 40k).
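The content-based fallback described above can be sketched roughly as follows. This is only an illustration, not the MonoDevelop or git implementation: the `BinarySniffer` class, its method names and the `window` parameter are hypothetical, and the default window simply mirrors the 40k figure mentioned in the comment.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of MonoDevelop: illustrates the
// "binary if a NUL byte appears near the start" heuristic.
public static class BinarySniffer
{
    // Returns true when a 0x00 byte occurs within the first `window` bytes.
    public static bool LooksBinary (Stream stream, int window = 40 * 1024)
    {
        var buffer = new byte[4096];
        int remaining = window;
        while (remaining > 0) {
            int read = stream.Read (buffer, 0, Math.Min (buffer.Length, remaining));
            if (read == 0)
                break; // end of stream reached without seeing a NUL byte
            if (Array.IndexOf (buffer, (byte)0, 0, read) >= 0)
                return true;
            remaining -= read;
        }
        return false;
    }

    // Convenience overload that opens the file and inspects its start.
    public static bool LooksBinary (string path, int window = 40 * 1024)
    {
        using (var stream = File.OpenRead (path))
            return LooksBinary (stream, window);
    }
}
```

A binary plist would be flagged by this check while an XML plist would not. Note that UTF-16 text also contains NUL bytes, so a real implementation would probably want to look for a byte order mark first; the sketch does not do that.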
Comment 3 Mike Krüger 2012-03-30 08:49:08 UTC
The new TextFileUtility class in the text editor will do it :)

(btw. renamed from TextFileReader because of name clashes & it does more than just read files)
Comment 4 Mikayla Hutchinson [MSFT] 2012-04-02 22:33:19 UTC
We have the function, but we're not using it in the mime services yet.
Comment 5 Mike Krüger 2012-04-03 09:24:49 UTC
Then use it. I use it:

+ For text file loading 
+ In the git version control (for getting specific file content)

The bug is about the function - not the use. Open bugs about the non-working subsystems.
Comment 6 Mike Krüger 2012-04-03 09:26:25 UTC
btw. mime types are the wrong thing for that - mime types can be misleading about text or binary - they're recognized by the file extension.

Just think of the .plist files.
Comment 7 Mikayla Hutchinson [MSFT] 2012-04-03 10:48:21 UTC
Comment 8 Mike Krüger 2012-04-11 03:36:54 UTC
The display bindings need to be changed.

Letting 10 display bindings each ask whether the file is binary is nonsense - how about changing

bool CanHandle (FilePath fileName, string mimeType, Project ownerProject);

to

bool CanHandle (FilePath fileName, string mimeType, bool isBinary, Project ownerProject);
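The idea behind the extra isBinary parameter, determining the answer once and handing it to every binding, could look roughly like this. It is a simplified sketch under stated assumptions: `IDisplayBindingSketch`, the two binding classes and `BindingSelector` are hypothetical names, and `FilePath` and `Project` from the proposed signature are replaced by a plain string or left out so the example stands alone.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for a display binding (hypothetical, not the
// MonoDevelop interface): isBinary is supplied by the caller.
public interface IDisplayBindingSketch
{
    string Name { get; }
    bool CanHandle (string fileName, string mimeType, bool isBinary);
}

public sealed class TextEditorBindingSketch : IDisplayBindingSketch
{
    public string Name { get { return "text"; } }

    public bool CanHandle (string fileName, string mimeType, bool isBinary)
    {
        return !isBinary; // only claims files already classified as text
    }
}

public sealed class HexEditorBindingSketch : IDisplayBindingSketch
{
    public string Name { get { return "hex"; } }

    public bool CanHandle (string fileName, string mimeType, bool isBinary)
    {
        return isBinary; // only claims files already classified as binary
    }
}

public static class BindingSelector
{
    // The caller classifies the file once; no binding re-reads the file.
    // Returns the first binding that accepts the file, or null if none does.
    public static IDisplayBindingSketch Select (
        IEnumerable<IDisplayBindingSketch> bindings,
        string fileName, string mimeType, bool isBinary)
    {
        foreach (var binding in bindings) {
            if (binding.CanHandle (fileName, mimeType, isBinary))
                return binding;
        }
        return null;
    }
}
```

With this shape the cost of sniffing the file is paid once per open, no matter how many bindings are registered.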
Comment 9 Mike Krüger 2012-05-24 06:10:14 UTC
Comment 10 Mike Krüger 2012-05-24 13:54:59 UTC
fix got reverted - maybe we revisit that later.
Comment 11 Marius Ungureanu 2013-11-23 10:35:09 UTC
What is the status of this?
Comment 12 Mike Krüger 2013-11-23 13:35:12 UTC
Would need some design work on our infrastructure - maybe a discussion topic for next Tuesday?
Comment 13 Amy Burns 2017-07-21 17:58:55 UTC
What is the status of this?
Comment 14 Mike Krüger 2018-01-22 09:55:53 UTC
Was fixed a long time ago :) 

In the current VS for Mac, opening a binary file should default to the hex editor, and a text file should open in the text editor.

However, nothing is 100%, so some binary files may open as text and vice versa, depending on the file, encoding and so on.
However, it should work reliably.

btw. we have the function in the desktop service:

GetFileIsText (string file, string mimeType = null)

As Alan proposed.