I'm trying to pass the FILE* returned by u_fgetfile to the fseek/fread functions.
When I link my application with the debug runtime libraries (/MTd, /MDd) there is no crash, but when I link against the release versions (/MT, /MD) this simple code crashes:
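#include <stdio.h>
#include "unicode/ustdio.h"

int main(void)
{
    UFILE* file = u_fopen("C:\\test.txt", "r", NULL, "UTF-8");
    if (file == NULL)
        return 1; /* could not open the file */
    /* This is the call that crashes in the failing configuration. */
    fseek(u_fgetfile(file), 3, SEEK_SET);
    u_fclose(file);
    return 0;
}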
This happens both with the official builds of ICU and with custom builds I make with Visual Studio 2012 (whether ICU itself is built in debug or release doesn't matter).
The only thing I have found so far is what looks like a mismatch in the FILE structure, but I'm really not sure.
Edit:
As part of adding a bounty to this question, here's a fully functional VS2012 project containing both the reproducer program (same as the code posted above) and ICU with source and binaries. Get it here: http://goo.gl/urTuU
It seems to me like the issue is within _lock_file where it says:
/*
 * The way the FILE (pointed to by pf) is locked depends on whether
 * it is part of _iob[] or not
 */
if ( (pf >= _iob) && (pf <= (&_iob[_IOB_ENTRIES-1])) )
{
    /*
     * FILE lies in _iob[] so the lock lies in _locktable[].
     */
    _lock( _STREAM_LOCKS + (int)(pf - _iob) );
    /* We set _IOLOCKED to indicate we locked the stream */
    pf->_flag |= _IOLOCKED;
}
else
    /*
     * Not part of _iob[]. Therefore, *pf is a _FILEX and the
     * lock field of the struct is an initialized critical
     * section.
     */
    EnterCriticalSection( &(((_FILEX *)pf)->lock) );
A "normal" FILE* will enter the top branch, the pointer returned from u_fgetfile will enter the bottom branch. Here it is assumed that it is a _FILEX*, which is most likely simply not correct.
As we can see, the runtime checks whether the file pointer pf lies within _iob. In the debugger, however, we can clearly see that it lies far outside of it (at least in the release build).
Given that u_fgetfile just returns a FILE* that was stored within the UFILE structure, we can inspect finit_owner in ufile.c to see how the FILE* ends up in our structure in the first place. After reading that code, I must assume that in a release build, two separate instances of the _iob array exist in the CRT, but in the debug build, only a single instance exists.
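To get around this problem, you're going to want to make sure that the FILE* is created in the same module as your main application, so that it belongs to the same CRT instance that later calls fseek/fread on it. To do that, you can open the file yourself and hand it to ICU with u_finit, like so: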
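/* Open the file with the application's own C runtime... */
FILE* filePointer = fopen("test.txt", "r");
/* ...then let ICU wrap the FILE* that already exists. */
UFILE* file = u_finit(filePointer, NULL, "UTF-8");
fseek(filePointer, 3, SEEK_SET); /* <- won't crash */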
Regarding your issue that came up after this, it seems to me like the underlying problem is simply sharing a FILE* between libraries, which fails because they have separate storage areas for FILE*. I find this somewhat confusing, but I don't have the necessary understanding of the involved components (and the style of the Windows C runtime code isn't helping either).
So, if the FILE* is allocated in ICU, then you can't lock it in your main application and vice versa (and trying to read or seek will always involve locking).
Unless there is a very obvious solution to this problem, which I'm missing, I would recommend emulating the behavior of u_fgets() (or whatever else you'll need) in your main application.
From what I can tell, u_fgets() just calls fread() to read data from the file and then uses ucnv_toUnicode(), with the converter stored in the UFILE (which you can retrieve with u_fgetConverter()), to convert the read data into a UChar*.
One way that seems to work is linking ICU statically. I don't know if that is an option for you, but it seems to resolve the issue on my end.
I downloaded the latest release of ICU (51.2) and compiled it with this helpful script. I then linked the project against the libraries in icu-release-static-win32-vs2012 (link with sicuuc.lib, sicuio.lib, sicudt.lib, sicuin.lib).
Now u_fgets() no longer causes an access violation. Of course, now my .exe is almost 23 MB big.