TParser is a class used internally by the Delphi and C++Builder IDEs to parse text-form DFM (form)
files into a binary format. Because it expects correctly formatted DFM text, its everyday usefulness is
somewhat limited, but it can still help out a lot. TParser is perhaps a bit inappropriately named; TLexer,
or something similar, would better suit its function. Unlike a true parser, TParser does not interpret
the data it reads from a file; it merely breaks that data up into tokens, which is the job of a lexer
(hence the misnomer). Here is its definition from Classes.hpp:
class DELPHICLASS TParser;
class PASCALIMPLEMENTATION TParser : public System::TObject
{
    typedef System::TObject inherited;

private:
    TStream* FStream;
    int FOrigin;
    char *FBuffer;
    char *FBufPtr;
    char *FBufEnd;
    char *FSourcePtr;
    char *FSourceEnd;
    char *FTokenPtr;
    char *FStringPtr;
    int FSourceLine;
    char FSaveChar;
    char FToken;
    char FFloatType;
    WideString FWideStr;
    void __fastcall ReadBuffer(void);
    void __fastcall SkipBlanks(void);

public:
    __fastcall TParser(TStream* Stream);
    __fastcall virtual ~TParser(void);
    void __fastcall CheckToken(char T);
    void __fastcall CheckTokenSymbol(const AnsiString S);
    void __fastcall Error(const AnsiString Ident);
    void __fastcall ErrorFmt(const AnsiString Ident, const System::TVarRec * Args, const int Args_Size);
    void __fastcall ErrorStr(const AnsiString Message);
    void __fastcall HexToBinary(TStream* Stream);
    char __fastcall NextToken(void);
    int __fastcall SourcePos(void);
    AnsiString __fastcall TokenComponentIdent();
    Extended __fastcall TokenFloat(void);
    __int64 __fastcall TokenInt(void);
    AnsiString __fastcall TokenString();
    WideString __fastcall TokenWideString();
    bool __fastcall TokenSymbolIs(const AnsiString S);
    __property char FloatType = {read=FFloatType, nodefault};
    __property int SourceLine = {read=FSourceLine, nodefault};
    __property char Token = {read=FToken, nodefault};
};
In our example, we will use TParser to break a text file up into tokens (words). Because TParser is
designed for DFM files, this is not necessarily its best application, but it does illustrate the concepts.
Here are the methods and properties of TParser that we need to concern ourselves with:
char __fastcall NextToken(void);
Advances to the next token and returns its type.
int __fastcall SourcePos(void);
Tells us what position in the source file we're at.
AnsiString __fastcall TokenString();
Tells us what the current token is, returned as a string.
(Other functions, such as TokenInt, return it as different types.)
__property int SourceLine = {read=FSourceLine, nodefault};
Tells us what line in the source file we're at.
__property char Token = {read=FToken, nodefault};
Tells us the type of the current token.
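How these members fit together may be easier to see in a stripped-down lexer. The following is only a sketch in standard C++, not TParser's actual implementation: MiniLexer, the tk... constants, and MiniLexerDemo are hypothetical names, the token rules are a simplified guess at DFM-like text, and SourceLine and Token are ordinary member functions here rather than properties.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical token codes for this sketch only; TParser uses its own
// toEOF/toSymbol/toString/toInteger/toFloat constants.
enum TokenKind : char { tkEOF = 0, tkSymbol, tkString, tkInteger, tkFloat };

class MiniLexer {
public:
    // Unlike a real TParser, this sketch reads nothing until NextToken() is called.
    explicit MiniLexer(const std::string& text) : src_(text) {}

    // Advance to the next token and return its kind.
    char NextToken() {
        // Skip blanks, counting line breaks as we go.
        while (pos_ < src_.size() && IsSpace(src_[pos_])) {
            if (src_[pos_] == '\n') ++line_;
            ++pos_;
        }
        start_ = pos_;
        if (pos_ >= src_.size()) { text_.clear(); kind_ = tkEOF; return kind_; }

        const char c = src_[pos_];
        if (IsAlpha(c) || c == '_') {            // identifier-like symbol
            while (pos_ < src_.size() && (IsAlnum(src_[pos_]) || src_[pos_] == '_')) ++pos_;
            kind_ = tkSymbol;
        } else if (IsDigit(c)) {                 // integer, or float if a '.' follows
            kind_ = tkInteger;
            while (pos_ < src_.size() && IsDigit(src_[pos_])) ++pos_;
            if (pos_ < src_.size() && src_[pos_] == '.') {
                kind_ = tkFloat;
                ++pos_;
                while (pos_ < src_.size() && IsDigit(src_[pos_])) ++pos_;
            }
        } else if (c == '\'') {                  // 'single quoted' string
            const std::size_t begin = ++pos_;
            while (pos_ < src_.size() && src_[pos_] != '\'') ++pos_;
            text_ = src_.substr(begin, pos_ - begin);
            if (pos_ < src_.size()) ++pos_;      // skip the closing quote
            kind_ = tkString;
            return kind_;
        } else {                                 // any other character is its own kind
            ++pos_;
            kind_ = c;
        }
        text_ = src_.substr(start_, pos_ - start_);
        return kind_;
    }

    char Token() const { return kind_; }                        // kind of the current token
    const std::string& TokenString() const { return text_; }    // text of the current token
    int SourceLine() const { return line_; }                    // 1-based line of the current token
    int SourcePos() const { return static_cast<int>(start_); }  // offset where the token starts

private:
    static bool IsSpace(char ch) { return std::isspace(static_cast<unsigned char>(ch)) != 0; }
    static bool IsAlpha(char ch) { return std::isalpha(static_cast<unsigned char>(ch)) != 0; }
    static bool IsAlnum(char ch) { return std::isalnum(static_cast<unsigned char>(ch)) != 0; }
    static bool IsDigit(char ch) { return std::isdigit(static_cast<unsigned char>(ch)) != 0; }

    std::string src_;
    std::size_t pos_ = 0;
    std::size_t start_ = 0;
    int line_ = 1;
    char kind_ = tkEOF;
    std::string text_;
};

// Usage sketch: walk "object Form1\n  Left = 42\nend" one token at a time.
inline void MiniLexerDemo() {
    MiniLexer lex("object Form1\n  Left = 42\nend");
    char kind = lex.NextToken();                 // "object"
    assert(kind == tkSymbol);
    assert(lex.TokenString() == "object");
    kind = lex.NextToken();                      // "Form1" starts at offset 7
    assert(kind == tkSymbol);
    assert(lex.SourcePos() == 7);
    kind = lex.NextToken();                      // "Left" is on line 2
    assert(lex.SourceLine() == 2);
    kind = lex.NextToken();                      // '=' is returned as its own kind
    assert(kind == '=');
    kind = lex.NextToken();                      // "42"
    assert(kind == tkInteger);
    assert(lex.TokenString() == "42");
}
```

The point of the sketch is the division of labour: NextToken() does the scanning, while Token(), TokenString(), SourceLine() and SourcePos() only report on whatever token was scanned last.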
Here is the code for the tokenizer. It should be pretty straightforward: it reads an input stream (a file),
tokenizes it, and dumps the results into a Memo.
void __fastcall TForm1::Button1Click(TObject *Sender)
{
    TFileStream *fs=new TFileStream(Edit1->Text,fmOpenRead);
    //dump the original file into the Memo, framed by separator lines
    fs->Position = 0;
    Memo1->Lines->Clear();
    Memo1->Lines->LoadFromStream(fs);
    Memo1->Lines->Insert(0,"-------------------");
    Memo1->Lines->Add("-------------------");
    //rewind so the parser starts at the beginning of the file
    fs->Position = 0;
    TParser *theParser=new TParser(fs);
    //the constructor already reads the first token, so test Token first
    //and call NextToken() at the end of each pass
    while (theParser->Token != toEOF) //while we're in the file
    {
        //Get Token
        AnsiString str=theParser->TokenString();
        //Get the position in the stream
        int Pos=theParser->SourcePos();
        //Get the line number
        int Line=theParser->SourceLine;
        //Get token type (any other type, such as a single character, is ignored)
        switch(theParser->Token)
        {
            case toSymbol:
                Memo1->Lines->Add(str+" is a symbol at line : "+Line+" position : "+Pos);
                break;
            case toInteger:
                Memo1->Lines->Add(str+" is an integer at line : "+Line+" position : "+Pos);
                break;
            case toFloat:
                Memo1->Lines->Add(str+" is a float at line : "+Line+" position : "+Pos);
                break;
            case toString:
                //note: TParser is designed for DFM's so that toString only works with
                //'single quoted' strings
                Memo1->Lines->Add(str+" is a string at line : "+Line+" position : "+Pos);
                break;
        }
        //move on to the next token
        theParser->NextToken();
    }
    //delete the parser before the stream it still points to
    delete theParser;
    delete fs;
}
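The case branches in the listing differ only in how they name the token type, so the message could be built in one place. Here is a minimal sketch in standard C++, so that it compiles without the VCL. The names kEOF through kFloat, TokenTypeLabel, and DescribeToken are hypothetical, and the numeric values are stand-ins assumed for illustration; real code would switch on toSymbol, toString, toInteger and toFloat from Classes.hpp and could build an AnsiString instead of a std::string.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Stand-in token codes (assumed values, for this sketch only).
const char kEOF = 0;
const char kSymbol = 1;
const char kString = 2;
const char kInteger = 3;
const char kFloat = 4;

// Map a token type to the wording used in the output line.
inline const char* TokenTypeLabel(char token) {
    switch (token) {
        case kEOF:     return "the end of the file";
        case kSymbol:  return "a symbol";
        case kString:  return "a string";
        case kInteger: return "an integer";
        case kFloat:   return "a float";
        default:       return "a special character";  // e.g. '=' or ':'
    }
}

// Build one report line in the same format as the listing.
inline std::string DescribeToken(const std::string& text, char token,
                                 int line, int pos) {
    std::ostringstream out;
    out << text << " is " << TokenTypeLabel(token)
        << " at line : " << line << " position : " << pos;
    return out.str();
}
```

With a helper like this, the body of the loop would shrink to a single Add call, and token types the switch currently skips (such as '=') would get a line of their own.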