TFindFile Content Search Bug

Post by softboy99 » May 6th, 2009, 9:06 am

I search with filename = *.doc and content = '123'.

The search returns some files that don't contain '123'.

Also, when I search for strings in Asian languages (e.g. Simplified Chinese), I get no results at all.

I use Delphi 2007 + Tnt Unicode v2.0.3, and your Unicode version of TFindFile.
softboy99
Active Member
Posts: 7
Joined: April 29th, 2009, 10:20 am

Re: TFindFile Content Search Bug

Post by softboy99 » May 6th, 2009, 9:50 am

Code:
function FileContainsPhrase(const FileName: WideString;
  const Phrase: TStringVariants; Options: TContentSearchOptions): Boolean;
const
  // The first two bytes of the file are read into a WideChar. On a
  // little-endian CPU, file bytes FF FE (UTF-16 LE) compare equal to #$FEFF
  // and file bytes FE FF (UTF-16 BE) compare equal to #$FFFE.
  UNICODE_BOM: WideChar = #$FEFF;
  UNICODE_BOM_SWAPPED: WideChar = #$FFFE;
  UTF8_BOM: array[1..3] of AnsiChar = (#$EF, #$BB, #$BF);
var
  Stream: TFileStream;
  BOMBytes: array[1..3] of AnsiChar;
  BOM: WideChar absolute BOMBytes; // overlays BOMBytes[1..2]
  BOMSize: Integer;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    // Read the first two bytes of the file and test them for a BOM.
    BOMSize := Stream.Read(BOM, SizeOf(BOM));
    if BOMSize = SizeOf(BOM) then
    begin
      if (BOM = UNICODE_BOM) or (BOM = UNICODE_BOM_SWAPPED) then
      begin
        // UTF-16 file: search with the Unicode form of the phrase.
        Result := StreamContainsPhraseWide(Stream, PWideChar(Phrase.Unicode),
          Length(Phrase.Unicode), Options, BOM = UNICODE_BOM_SWAPPED);
        Exit;
      end
      else if (BOMBytes[1] = UTF8_BOM[1]) and (BOMBytes[2] = UTF8_BOM[2]) then
      begin
        // Possible UTF-8 BOM: read the third byte to confirm it.
        Inc(BOMSize, Stream.Read(BOMBytes[3], SizeOf(BOMBytes[3])));
        if (BOMSize = SizeOf(UTF8_BOM)) and (BOMBytes[3] = UTF8_BOM[3]) then
        begin
          Result := StreamContainsPhraseUtf8(Stream, PAnsiChar(Phrase.Utf8),
            Length(Phrase.Utf8), Options);
          Exit;
        end;
      end;
    end;
    // No recognized BOM: rewind over the bytes already read and search the
    // raw content with the ANSI form of the phrase.
    Stream.Seek(-BOMSize, soFromCurrent);
    Result := StreamContainsPhraseAnsi(Stream, PAnsiChar(Phrase.Ansi),
      Length(Phrase.Ansi), Options);
  finally
    Stream.Free;
  end;
end;


The problem is in the BOM = UNICODE_BOM check. It seems that BOM never equals UNICODE_BOM when you search for a Simplified Chinese string, so only StreamContainsPhraseAnsi is called.
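
A quick way to see which branch FileContainsPhrase would take for a particular file is to look at the file's first bytes, since the BOM test is made against the file being searched and not against the search phrase. The following is only a minimal sketch under that reading of the code above; DescribeBom and the BomCheck program are hypothetical names, not part of TFindFile.

```pascal
program BomCheck;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper (not part of TFindFile): names the byte order mark,
// if any, found in the first bytes of a file.
function DescribeBom(const FileName: string): string;
var
  Stream: TFileStream;
  Head: array[0..2] of Byte;
  Count: Integer;
begin
  FillChar(Head, SizeOf(Head), 0);
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    // Read up to three bytes; Count is smaller for very short files.
    Count := Stream.Read(Head, SizeOf(Head));
  finally
    Stream.Free;
  end;
  if (Count >= 3) and (Head[0] = $EF) and (Head[1] = $BB)
    and (Head[2] = $BF) then
    Result := 'UTF-8 BOM'
  else if (Count >= 2) and (Head[0] = $FF) and (Head[1] = $FE) then
    Result := 'UTF-16 little-endian BOM'
  else if (Count >= 2) and (Head[0] = $FE) and (Head[1] = $FF) then
    Result := 'UTF-16 big-endian BOM'
  else
    Result := 'no BOM';
end;

begin
  if ParamCount < 1 then
    Writeln('Usage: BomCheck <file>')
  else
    Writeln(ParamStr(1), ': ', DescribeBom(ParamStr(1)));
end.
```

If a file is reported as having no BOM, the function quoted above would fall through to the StreamContainsPhraseAnsi call for that file, whatever language the search phrase is in.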

Re: TFindFile Content Search Bug

Post by Kambiz » May 6th, 2009, 10:03 am

Please attach one of your text files, so that I can trace the issue.
Kambiz
Administrator
Posts: 2408
Joined: March 7th, 2003, 7:10 pm

Re: TFindFile Content Search Bug

Post by softboy99 » May 7th, 2009, 12:42 am

Search content = '机密'
Attachments
serachtarget.doc
(22.5 KiB) Downloaded 53 times

Re: TFindFile Content Search Bug

Post by softboy99 » May 7th, 2009, 12:59 am

http://www.totalcmd.net/plugring/TextSearch.html

You can refer to the solution at the above link.

Re: TFindFile Content Search Bug

Post by softboy99 » May 7th, 2009, 1:01 am

http://www.totalcmd.net/plugring/xPDFSearch.html

This one offers another feature: searching inside PDF files.

Re: TFindFile Content Search Bug

Post by Kambiz » May 7th, 2009, 2:38 am

I just understood what you mean.

That's not a bug but a limitation. For non-text files, the component only does a plain search of the raw file content.
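
To illustrate what a plain search means for a binary file such as a .doc: the file has no BOM, so the phrase is compared with the raw bytes in its ANSI form. Word 97-2003 documents, as far as I know, usually store non-Latin text as UTF-16 little-endian, so an ANSI byte pattern would not match Chinese text. The sketch below is a hypothetical stand-alone helper (FileContainsUtf16Bytes is not part of TFindFile) that looks for the UTF-16 little-endian bytes of a phrase in the raw file content. It assumes a little-endian platform, is case-sensitive, loads the whole file into memory, and ignores the real document structure, so it can miss text or match unrelated data.

```pascal
program RawWideSearch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

const
  // The phrase from this thread, written as code points (U+673A, U+5BC6)
  // so that the encoding of the source file does not matter.
  SearchPhrase: WideString = #$673A#$5BC6;

// Hypothetical helper (not part of TFindFile): reports whether the raw
// bytes of a file contain the UTF-16 little-endian encoding of Phrase.
function FileContainsUtf16Bytes(const FileName: string;
  const Phrase: WideString): Boolean;
var
  Stream: TFileStream;
  Buffer: array of Byte;
  PatternSize, Size, I: Integer;
begin
  Result := False;
  PatternSize := Length(Phrase) * SizeOf(WideChar);
  if PatternSize = 0 then
    Exit;
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    // Assumption: the file is small enough to be held in memory.
    Size := Integer(Stream.Size);
    SetLength(Buffer, Size);
    if Size > 0 then
      Stream.ReadBuffer(Buffer[0], Size);
  finally
    Stream.Free;
  end;
  // On a little-endian platform a WideString is already laid out as
  // UTF-16 LE, so its memory can be compared with the file bytes directly.
  for I := 0 to Size - PatternSize do
    if CompareMem(@Buffer[I], PWideChar(Phrase), PatternSize) then
    begin
      Result := True;
      Exit;
    end;
end;

begin
  if ParamCount < 1 then
    Writeln('Usage: RawWideSearch <file>')
  else if FileContainsUtf16Bytes(ParamStr(1), SearchPhrase) then
    Writeln('found')
  else
    Writeln('not found');
end.
```

A raw byte match like this also shows how a plain search can report hits in files whose visible text does not contain the phrase, because the pattern may occur in non-text data.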

Re: TFindFile Content Search Bug

Post by softboy99 » May 8th, 2009, 2:17 am

Oh, that's too bad. Could you please add the ability to search content in non-plain-text files?

I have posted Total Commander's solution, which is open source, in my previous posts here.

