I converted a lot of my Delphi code and most of that worked quite nicely.
But now I have to deal with the code which works with strings to parse files and such.
I am using arrays of Byte, which can easily be read from streams and converted to AnsiStrings, and the code looks quite OK. My approach tries to keep the code compilable with Delphi, so there are quite a lot of IFDEFs.
But while working with strings I found them less consistent than expected:
AnsiString is missing a Compare function and also a SubString function.
I cannot easily assign an AnsiString to a String and vice versa. For file parsing in particular it would be completely OK if only the ASCII range were copied, unless a code page can be specified.
DelphiStrings also lack a lot of the String functions, but on the other hand they can be passed as parameters to the good old Compare() function. AnsiStrings do not work there.
The assignment
var S2 : AnsiString := Copy(S,1,7);
seems to raise an exception if S does not have enough characters. In Delphi, Copy would not cause a problem: the result is simply truncated to the characters that are available.
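For reference, the Delphi behaviour described above can be sketched on a plain byte array. This is only an illustration under assumptions: ClampedCopy and the CopySketch namespace are hypothetical names and not part of the RTL, and the index is 1-based to mirror Copy.

```oxygene
namespace CopySketch;

interface

type
  Program = class
  public
    // Hypothetical helper (not part of the RTL): returns up to aCount bytes
    // starting at the 1-based position aIndex, clamping instead of raising.
    class method ClampedCopy(aData: array of Byte; aIndex: Integer; aCount: Integer): array of Byte;
    class method Main(args: array of String): Int32;
  end;

implementation

class method Program.ClampedCopy(aData: array of Byte; aIndex: Integer; aCount: Integer): array of Byte;
begin
  var lStart := aIndex - 1;                    // convert to a 0-based offset
  if lStart < 0 then lStart := 0;              // an index below 1 is treated as 1
  var lAvailable := length(aData) - lStart;    // bytes left from the start offset
  if lAvailable < 0 then lAvailable := 0;
  var lCount := aCount;
  if lCount > lAvailable then lCount := lAvailable;
  if lCount < 0 then lCount := 0;
  result := new Byte[lCount];
  for i: Integer := 0 to lCount - 1 do
    result[i] := aData[lStart + i];
end;

class method Program.Main(args: array of String): Int32;
begin
  var lShort: array of Byte := [116, 101, 120, 116]; // the bytes of 'text'
  var lPart := ClampedCopy(lShort, 1, 7);            // 7 requested, only 4 available
  writeLn(length(lPart));
end;

end.
```

The point of the sketch is only the clamping order: the start offset is normalised first, then the count is limited to what remains, so no index outside the array is ever touched.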
Would it be possible to make the string types and functions more interchangeable?
var S3 : AnsiString := 'text';
if S3 = 'text' then
fails with an index error.
The problem is here: Chars[i] is read before i is checked against the length.
operator AnsiString.Equal(Value1: AnsiString; Value2: AnsiString): Boolean;
begin
…
var i := 0;
while (Value1.Chars[i] = Value2.Chars[i]) and (i < Value1.Length) do
inc(i);
…
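One way to avoid the out-of-range access is to compare the lengths first and then only visit indices that are valid for both operands. The following is a minimal sketch on plain byte arrays, not the actual RTL fix; BytesEqual and the EqualSketch namespace are hypothetical names introduced for illustration.

```oxygene
namespace EqualSketch;

interface

type
  Program = class
  public
    // Hypothetical helper (not part of the RTL).
    class method BytesEqual(aLeft: array of Byte; aRight: array of Byte): Boolean;
    class method Main(args: array of String): Int32;
  end;

implementation

class method Program.BytesEqual(aLeft: array of Byte; aRight: array of Byte): Boolean;
begin
  // Different lengths can never be equal. Checking this first also
  // guarantees that every index used below is valid for both arrays.
  if length(aLeft) <> length(aRight) then exit false;
  for i: Integer := 0 to length(aLeft) - 1 do
    if aLeft[i] <> aRight[i] then exit false;
  exit true;
end;

class method Program.Main(args: array of String): Int32;
begin
  var lText: array of Byte := [116, 101, 120, 116];  // the bytes of 'text'
  var lSame: array of Byte := [116, 101, 120, 116];
  var lShort: array of Byte := [116, 101];           // the bytes of 'te'
  writeLn(BytesEqual(lText, lSame));   // same content
  writeLn(BytesEqual(lText, lShort));  // different lengths, no index error
end;

end.
```

A for loop is used here so that the sketch does not depend on the evaluation order of the two conditions in a combined while expression.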
Hi,
the Copy function has been fixed (also Remove, which had the same issue).
AnsiString now has a SubString function.
These fixes are in GitHub and in the next beta.
I’m going to take a look at assigning AnsiString <-> DelphiString now.
Also, which functions are you missing in the DelphiString type?
Thanks a lot.
Comparing AnsiString with DelphiString (and the operators) will work, but first we need to know what is stored in the bytes. We can add a field to AnsiString, like a CodePage, supporting ASCII, UTF8, UTF16LE and UTF16BE for now.
Is this ok?
I don’t think this element would be necessary. I understand the idea behind it, but for the cases I would use an AnsiString I would only need to compare the lower 7 bits of the byte with the lower 7 bits of the unicode character. But that comparison should be as fast as possible.
If I need special characters I can still convert the AnsiString to a String (or DelphiString) while providing a code page.
Delphi does an on-the-fly conversion using the current code page, which kind of works but causes problems when a program runs under a different locale. Having an additional field in AnsiString somewhat contradicts the purpose of this type, which is to be small and efficient. If the string is decoded for each comparison, I expect it will seriously slow down parsing logic.
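The 7-bit comparison described above could look roughly like the following sketch. It is an illustration under assumptions, not an RTL API: EqualsAscii and the AsciiSketch namespace are hypothetical names, and the String is assumed to be indexed from 0, as Oxygene does for platform strings.

```oxygene
namespace AsciiSketch;

interface

type
  Program = class
  public
    // Hypothetical helper (not part of the RTL): compares raw bytes with a
    // String without any code page lookup or intermediate decoding.
    class method EqualsAscii(aBytes: array of Byte; aText: String): Boolean;
    class method Main(args: array of String): Int32;
  end;

implementation

class method Program.EqualsAscii(aBytes: array of Byte; aText: String): Boolean;
begin
  if length(aBytes) <> length(aText) then exit false;
  for i: Integer := 0 to length(aBytes) - 1 do
    // Only the lower 7 bits of each side are compared, as described in the
    // post. Note that this cannot tell characters above 127 apart.
    if (aBytes[i] and $7F) <> (ord(aText[i]) and $7F) then exit false;
  exit true;
end;

class method Program.Main(args: array of String): Int32;
begin
  var lKeyword: array of Byte := [116, 101, 120, 116]; // the bytes of 'text'
  writeLn(EqualsAscii(lKeyword, 'text'));
  writeLn(EqualsAscii(lKeyword, 'test'));
end;

end.
```

Because no per-string encoding field is consulted, the cost per comparison is one mask and one integer compare per character, which is the property the post asks for.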
The Stream has a method ReadBytes(Count: LongInt): TBytes; - I use that to read an AnsiString from a file. Is there something similar to read a String or DelphiString from a file?
OK, I see, let’s keep it simple then, without on-the-fly conversions that could slow down the comparison operations.
There isn’t a read method for String / DelphiString right now. I’m going to add a couple of TStream methods, which will be available in the next beta:
method ReadString(Count: LongInt; aEncoding: TEncoding = Unicode): DelphiString
method WriteString(Bytes: array of Byte; aEncoding: TEncoding = Unicode): LongInt
I have more:
The + operator fails if the first string is not initialized.
I included these changes to make it work.
constructor AnsiString(Value: PlatformString; AsUTF16Bytes: Boolean := false);
begin
if (Value=nil) or (Value.Count=0) then exit;
method AnsiString.Insert(aIndex: Integer; aValue: AnsiString): AnsiString;
begin
if fData=nil then
begin fData := aValue.Data; exit; end;
method AnsiString.Insert(aIndex: Integer; Value: AnsiChar): AnsiString;
begin
if fData=nil then
begin fData := new Byte[1]; fData[0] := Byte(Value); exit; end;
I cannot reproduce this one. Could you please post a small test case and the platform?
OK, I cannot reproduce the issue because get_length has been fixed, so when the AnsiString is not initialized it returns 0 as the length.
But y.Insert(0, x) still fails when y is not initialized, so I am fixing that!