Possibility of stack-allocated classes

LazarusRisen · December 23, 2020, 10:02pm

Heyo RemObjects-Team!

In C++ you can create a class instance on the stack (local objects live there by default) and still get full inheritance, polymorphism, operator overloading, etc.

If you need the instance on the heap, you allocate it and hold a pointer or reference to it. FPC/Delphi have something similar called “objects”, which I sadly miss in Oxygene. Being able to place a data structure anywhere in memory is really powerful, especially when you work on low-level stuff (like I do right now with an engine I'm involved in) and need proper control over optimization. I would also love it if you considered supporting the newly released “Management Operators” from Delphi, which define how records are initialized, assigned, and finalized; that is another very important low-level feature. If you do implement those, I would love to see them on the stack-based class type as well (should you decide to implement it), since, as mentioned, it allows for a lot of nice and important shenanigans.
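A minimal C++ sketch of the point above: the same class works polymorphically whether the object sits in a stack frame or on the heap. The names `Shape`, `Circle`, and `describe` are made up for illustration and do not come from any library.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical hierarchy, used only for illustration.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// Virtual dispatch goes through the reference, wherever the object lives.
std::string describe(const Shape& s) { return s.name(); }

std::string describe_stack_object() {
    Circle c;            // automatic storage: lives in this stack frame
    return describe(c);  // still dispatches to Circle::name
}

std::string describe_heap_object() {
    // Heap storage is opt-in and needs a (smart) pointer to hold it.
    std::unique_ptr<Shape> p = std::make_unique<Circle>();
    return describe(*p);
}
```

Calling either function from `main` gives the same result; only the storage location of the object differs.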

So to summarize:

- Implement a new Delphi-like object type that stores a class on the stack (not using the class keyword, of course), with the usual support for interfaces, class operators, polymorphism, management operators (see the Delphi example), and move semantics (C++).
- Also allow management operators (see the Delphi example) and move semantics (C++) for records.

Not gonna lie, if this could find its way to Island, I would be a happy panda, since it opens up a big box of potential for heavy optimization within Island! And it would be a nice Christmas gift xD

PS: some may know this, but for those who do not: std::move is a powerful utility that enables a cheaper transfer of data than in Delphi when the assignment operator is overloaded for a custom class/record. Say a class contains a string and you don't want that string copied every time you assign one instance to another (I'm talking about C++ here): you overload the assignment operator and use std::move(myString) to hand the string's buffer over to the target, so you don't have to copy a string of arbitrary length each time, which is much faster than a copy.
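A minimal sketch of the string case described in the PS, assuming a made-up class `Document` with a single `std::string` member. The helper `move_into` is likewise hypothetical and only exists to make the effect easy to check.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class holding a string of arbitrary length.
class Document {
public:
    explicit Document(std::string text) : text_(std::move(text)) {}
    Document(const Document&) = default;
    Document(Document&&) noexcept = default;

    // Copy assignment: duplicates the character buffer.
    Document& operator=(const Document& other) {
        text_ = other.text_;
        return *this;
    }

    // Move assignment: takes over the source's buffer instead of copying it.
    Document& operator=(Document&& other) noexcept {
        text_ = std::move(other.text_);
        return *this;
    }

    const std::string& text() const { return text_; }

private:
    std::string text_;
};

// Moves `from` into `to` and returns the text now held by `to`.
// std::move itself is only a cast; operator=(Document&&) does the work.
std::string move_into(Document& to, Document& from) {
    to = std::move(from);
    return to.text();
}
```

After `move_into(b, a)`, `b` holds the text and `a` is left in a valid but unspecified state, so it should only be reassigned or destroyed afterwards.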

I hope I explained all that somewhat decently and that it at least draws some attention.

Have a blessed week/time/Christmastime!

Sven

PS: I have read an existing thread on this in the FPC forum; apparently it is being discussed there as well. For more information, see: How optimized is the FPC compiler (freepascal.org)

mh · December 23, 2020, 10:32pm

Probably a bit late to get this in in time for Christmas

Kidding aside, I’m not sure if this is technically feasible on all platforms (.NET, Java, Cocoa might all not support this on runtime level) but it’s worth considering; I’ll log a feature request for review tomorrow.

No promises or timeline.

LazarusRisen · December 24, 2020, 8:09am

Hey Mark, thanks for the quick answer.

First, I want to thank you for the consideration itself; it is very welcome. Second, since C++ (via Clang) also uses LLVM as a backend, and if I'm not mistaken only the backend matters here, it should not matter which runtime it targets, because:

The old “object”/stack-class type from FPC and Delphi is only considered “deprecated” because of the old 1 MB stack size limit, which is of course tiny. So I think back then, fearing memory issues, they mostly abandoned the idea and went straight to the heap for the major part, while still allowing simple stack-allocated objects, called records, to implement interfaces and operators (hopefully at some point also the new management operators). But since this limitation is itself outdated, objects can be used again, which quite a decent number of FPC devs already do for computational purposes, where they even have polymorphism, e.g.
TMyStackObject = object(TMyObjectBase, IInterface1)
and operator overloading, all within stack frames. So I think this should also be feasible for VM-based runtimes like Java/C#/Cocoa, since they made the same choice in their object model because of that stack limitation back then. On top of that, they were never really designed from scratch for efficiency in the first place, so it's understandable they didn't make use of such constructs (a separate topic for Island, which I am happy about, to say the least). But yeah, I think it shouuuuld be possible to mimic it for .NET/JVM/Cocoa. I would love to hear if and how you guys handle this topic.

mh · December 24, 2020, 11:32am

Well, we only use LLVM for the native backends; .NET and Java impose their own runtime object model (and we also don’t use LLVM for them, but that’s beside the point, really), and the same is true for Cocoa objects.

For Island, it would definitely be doable, but then it becomes a question of whether we want to do it if it can’t be on all platforms: probably not. That said, this is all above my pay grade, and just a guess. The compiler team will need to have a look in the new year.

LazarusRisen · December 24, 2020, 11:49am

Yeah, I know what you mean; that makes sense of course. To be fair, I really hope it is somehow possible for all platforms. (I have seen that C++ allows this because its compiler frontend, Clang, which is based on LLVM, provides this abstraction for every native platform. And yes, Clang has been developed by a ton of people for 12 years as a single compiler frontend, so of course there are other capabilities that make it possible.) But maybe, maybe, the smart compiler team will find a way.

Patrick · December 24, 2020, 1:37pm

Hello Sven,
.NET doesn’t allow this: value types (primitives, such as Int32, Double, …) are on the stack and reference types (classes) are on the heap. Even String is considered a reference type.

Reference types - C# Reference | Microsoft Docs
Value types - C# reference | Microsoft Docs
Difference between a Value Type and a Reference Type

LazarusRisen · December 24, 2020, 3:13pm

Hi Patrick,

I’m fully aware of the object model of .NET and Java, but my idea was that maybe they can implement some sort of mapped object model for the stack-class type on those platforms too, since, as far as I know, the code is only shared with RemObjects’ .NET variants, not with Oracle’s or Microsoft’s. If that is true, they might introduce this type for their other platforms as well. It depends on what the compiler team says about this; I’m kind of awaiting their response.

LazarusRisen · December 26, 2020, 11:35am

@mh
Actually, I have looked a bit more into Island’s/Oxygene’s OOP model, and it looks like your record type already allows interface implementation and operator overloading, so the only things needed right now for this feature request are:

Polymorphism: records being able to inherit from other records (not classes, though), and with that, declaring a record as abstract, its members as virtual, and of course overriding them in inherited records. And the "std::move" semantics from C++, done in a nice Oxygene way, would complete this.

I would really love to hear @ck’s opinion on that topic.

ck · December 29, 2020, 8:38am

Sorry I missed your question. So actual record inheritance isn’t hard to do and a curious concept that could even work on all platforms. What I am curious about is what you want to accomplish with move semantics?

LazarusRisen · December 29, 2020, 9:47am

Heyo Carlo,

glad to hear from you; 1.)

This I endorse a lot; it would definitely be huge.

It’s more something that came from a friend of mine (he also complained about it in the FPC forum, haha). He was about to write a crypto library and found that FPC doesn’t let you produce nice, easily readable code with the same efficiency that C++ allows. He asked me to ask you guys whether Oxygene allows such constructs to be written as concisely, and of course as performantly. Here is something we discussed a week ago.

Quote:
“The problem here is that as soon as you use operator overloading, the amount of copies is really annoying. For example I wrote a gmp wrapper recently, here are some parts of the code:”

class operator TAPInteger.Finalize(var a: TAPInteger);
begin
  mpz_clear(a.FData);
end;
 
class operator TAPInteger.Copy(constref aSrc: TAPInteger;
  var aDst: TAPInteger);
begin
  mpz_set(aDst.FData, PAPInteger(@aSrc)^.FData); // deepcopies the whole data
end;
 
class operator TAPInteger.+(constref lhs: TAPInteger; constref
  rhs: TAPInteger): TAPInteger;
begin
  mpz_add(Result.FData, PAPInteger(@lhs)^.FData, PAPInteger(@rhs)^.FData);
end;

The expression c := a + b; creates a temporary object that holds the result of the + operation, which is then copied into c using the copy operator.
I wanted to implement some crypto algorithms just out of interest, i.e. not write production-ready code, so the copies are not a problem. But if you wanted to deploy this code on a server that has to be quick when establishing handshakes and the like, this would be a problem, as each copy needs to copy around 500 bytes.
The C++ OOP interface of GMP uses move semantics for this, i.e. a temporary object still gets created, but the assignment to c only copies the pointer, not the whole data.
To get the same in Pascal you would have to go completely without operator overloading, and personally I think this makes the code much worse. Just look at RSA key generation. This is how it reads with operator overloading, the way it could be done in C++, Sven:

p := TAPInteger.RandomPrime;
q := TAPInteger.RandomPrime;
m := p * q;
phi := (p-1) * (q-1);
pub := 65537;
priv := pub.inverse(phi);

This is much better than writing the following code in Pascal/Delphi:

mpz_init(p);
mpz_init(q);
mpz_init(m);
mpz_init(phi);
mpz_init_set_ui(pub, 65537);
mpz_init(priv);
generateRandomPrime(p);
generateRandomPrime(q);
mpz_mul(m, p, q);
mpz_init(phi_p);
mpz_init(phi_q);
mpz_sub_ui(phi_p, p, 1);
mpz_sub_ui(phi_q, q, 1);
mpz_mul(phi, phi_p, phi_q);
mpz_clear(phi_p);
mpz_clear(phi_q);
mpz_inverse(priv, pub, phi);

In C++ you can write code like the former with literally no drawbacks; in Pascal, if you need performance, you have to write the latter if you don’t want to lose a lot of performance to copying.
My point is that even considering this, it is missing the move semantic. Meaning you often copy data from temporary objects, instead of just grabbing their pointers. As these objects are temporary, they don’t need that pointer afterwards, so you can save a lot of performance by doing so.

C++ code:
std::vector<int> v1{1,2,3,4}, v2;
v2 = v1; // copies all data from v1
v2 = std::move(v1); // treats v1 as a temporary and moves its contents into v2; v1 is left valid but unspecified (in practice empty), and v2 now holds all the data

For example, the result of a function is always a temporary object, meaning there is no point in copying data if you can move it instead.
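A minimal C++ sketch of the `c := a + b` case discussed above, assuming a made-up `BigNum` type as a stand-in for a GMP wrapper (it is not the GMP C++ interface, and the addition is a toy without carry). A global counter records how many deep copy assignments happen.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

int g_deep_copies = 0;  // counts copy assignments, for illustration only

// Hypothetical stand-in for an arbitrary-precision integer: the limbs live
// in a heap buffer, so a deep copy costs time proportional to the size.
struct BigNum {
    std::vector<unsigned> limbs;

    explicit BigNum(std::size_t n = 0, unsigned fill = 0) : limbs(n, fill) {}
    BigNum(const BigNum&) = default;
    BigNum(BigNum&&) noexcept = default;

    BigNum& operator=(const BigNum& other) {      // deep copy
        ++g_deep_copies;
        limbs = other.limbs;
        return *this;
    }
    BigNum& operator=(BigNum&& other) noexcept {  // steals the buffer
        limbs = std::move(other.limbs);
        return *this;
    }
};

// Toy limb-wise addition without carry; enough to produce a temporary.
BigNum operator+(const BigNum& a, const BigNum& b) {
    BigNum r(a.limbs.size());
    for (std::size_t i = 0; i < r.limbs.size(); ++i)
        r.limbs[i] = a.limbs[i] + (i < b.limbs.size() ? b.limbs[i] : 0u);
    return r;
}

int copies_for_sum() {
    BigNum a(128, 1), b(128, 2), c;
    g_deep_copies = 0;
    c = a + b;  // a + b is a temporary (rvalue): move assignment is chosen
    assert(c.limbs.size() == 128 && c.limbs[0] == 3);
    return g_deep_copies;
}

int copies_for_plain_assignment() {
    BigNum a(128, 1), c;
    g_deep_copies = 0;
    c = a;      // a is a named object (lvalue): copy assignment is chosen
    return g_deep_copies;
}
```

The operator-overloaded expression stays readable, and the compiler picks the cheap overload on its own whenever the right-hand side is a temporary.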

It should be noted that, besides this, C++ compilers often apply return value optimization: instead of creating a temporary object that is returned by a function and then moved or copied, the compiler simply writes into the target object if possible, omitting any move or copy. That is an optimization FPC could also greatly benefit from. But even without it, move semantics make handling complex data types via copy assignment much easier.
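A small C++ sketch of return value optimization, using a made-up `Tracked` type that counts its own copies and moves. Since C++17 the direct-return case must be constructed in place; the named-return case is an optional optimization, so only the absence of copies is relied on there.

```cpp
#include <cassert>

// Hypothetical type that counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    int value;

    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& o) : value(o.value) { ++copies; }
    Tracked(Tracked&& o) noexcept : value(o.value) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Returns a prvalue: the result is built directly in the caller's object.
Tracked make_direct(int v) { return Tracked(v); }

// Returns a named local: the compiler may elide the transfer (NRVO);
// if it does not, the move constructor is used, never the copy constructor.
Tracked make_named(int v) {
    Tracked t(v);
    t.value += 1;
    return t;
}

int copies_and_moves_for_direct_return() {
    Tracked::copies = Tracked::moves = 0;
    Tracked t = make_direct(41);
    assert(t.value == 41);
    return Tracked::copies + Tracked::moves;
}

int copies_for_named_return() {
    Tracked::copies = Tracked::moves = 0;
    Tracked t = make_named(41);
    assert(t.value == 42);
    return Tracked::copies;
}
```

Move semantics act as the fallback here: where elision does not apply, the result is still moved rather than deep-copied.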

That was our small exchange back then, and I share his point of view. I also have to point out that Oxygene on Island/Cocoa sadly doesn’t yet allow overloading the copy-assign operator either, so you cannot override the “:=” operator as you now can in Delphi and FPC 3.2.

So all in all, having the assign/copy operator (:=) and the related move semantics or move functions (depending on how you would do it, if you decide to) would allow for a lot of internal optimizations without making code disgusting to read or write.

Again, I would be intrigued to hear your opinion on this entire topic. Have a blessed week, too!

LazarusRisen · December 29, 2020, 9:56am

I want to add something about a common reaction where people say, "pls do not turn Pascal into C++". Overall I understand that and feel the same. On the other hand, we are talking about a platform that uses LLVM as its backend and already embraces a lot of freedom in what you can do with memory: raw pointers, a lot of different pass-by-reference options, clearer code (which by its nature is more readable and therefore allows more efficient code), etc. So acknowledging that C++ does some things really well, and porting them to a platform that in theory already embraces SOME of those good things, is imho not a bad thing to do. Of course I do not want to turn Oxygene/Island into the next C++-ish abomination of a language, but as I said, some things that make sense and are not too shabby could be done. Of course it’s up to the compiler guys (if they decide to do it) WHEN these features arrive, since I know there is other stuff to do or prioritize before any new features. I’m more or less acting as a feature messenger right now.

ck · December 29, 2020, 2:35pm

Oh I don’t disagree. But maybe we should start with record inheritance, then see if the existing operator logic won’t suffice? I’m currently out of the office but we already have copy ctors, assignment operators, initializing ctors and more. Maybe we can use the existing ones to accomplish what we need, else we can always look at adding things, where needed.

LazarusRisen · December 29, 2020, 3:05pm

That’s a good idea. By the way, if you implement record inheritance, would it still allow inheriting from records that themselves implement interfaces? Just to be sure; otherwise it would be a bit lame if only plain records could inherit from plain records, haha. On the other point: is Oxygene currently able to implement copy constructors and overloading of the assignment operator? Can you make a compilable example? I couldn’t find the docs for that.

And yeah, of course, maybe he or I will also find a cleaner way to replace move semantics. It wouldn’t break our necks if they are not included, but if you find time for this at some point, do think about it. Don’t hurry though; as I said, other stuff has priority.

ck · December 30, 2020, 6:16am

Of course. Wouldn’t make sense otherwise.

Okay, so the special operators/methods for records are:

Finalizer
    Called when it needs to be freed.
constructor()
    Parameterless constructors are called when entering the scope (unless it’s assigned directly from another place, or another constructor is called).
constructor Copy(var aValue: OriginalType)
    Called when initializing from another value; should presume “self” is uninitialized.
class operator assign(var aDest: OriginalType; var aSource: OriginalType);
    Copies aSource into aDest.
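For readers coming from C++, the four hooks listed above roughly correspond to the default constructor, copy constructor, copy assignment operator, and destructor. The sketch below is a C++ analogy only, not Oxygene code and not a statement about how the Oxygene compiler behaves; `Rec`, `g_log`, and `trace_lifetime` are made-up names.

```cpp
#include <cassert>
#include <string>

std::string g_log;  // records which hook ran, for illustration only

// Hypothetical value type that logs each lifetime hook.
struct Rec {
    int v = 0;
    Rec()                        { g_log += "init;"; }       // parameterless constructor
    Rec(const Rec& o) : v(o.v)   { g_log += "copy-ctor;"; }  // initialise from another value
    Rec& operator=(const Rec& o) { v = o.v; g_log += "assign;"; return *this; }
    ~Rec()                       { g_log += "final;"; }      // finalizer
};

std::string trace_lifetime() {
    g_log.clear();
    {
        Rec a;      // init: runs when the variable enters scope
        Rec b = a;  // copy-ctor: b does not exist yet
        a = b;      // assign: a already holds a value
    }               // final runs for b, then for a, at scope exit
    return g_log;
}
```

The distinction that matters is between the copy constructor, which starts from uninitialised storage, and the assign operator, which has to deal with the value the destination already holds.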

LazarusRisen · December 30, 2020, 6:30am

This indeed makes me smile, especially the last two. I was aware of the first two, but the assign operator and the copy constructor weren’t on my radar. Nice!

ck · December 30, 2020, 6:56am

But what would a move constructor/operator look like in Oxygene, and what would it add over the above?

LazarusRisen · December 30, 2020, 7:55am

I would do it via a move constructor that looks exactly like the copy constructor, allowing the same sort of syntax as C++ without the “std::”. And instead of “move” (I think this name carries a lot of other meanings nowadays because of C), I would rather call it something like MoveRef, just to prevent associations that don’t share the same intent.

As for what this allows us to do, take the example of my friend above, who is writing custom set operations for his crypto library. Consider the usual set operation:
res := set_a + set_b
Without move semantics, “a + b” creates a temporary as the result of the operation, which is then deep-copied into “res”. Instead of making a deep copy, we can just hand the temporary’s pointer to “res”, since the temporary will disappear anyway. The temporary is the rvalue, whose contents are simply passed on to “res” (the lvalue).

So that’s basically what it’s good for: avoiding unnecessary copies, especially if they happen very frequently. When you know a certain object is going to be disposed of anyway but you need its resources for something else (as in the example above, even if it’s not the best example), you should be able to steal its resources instead of recreating them, if you know what I mean.

ck · December 30, 2020, 7:59am

constructor Move(var aInput: ^InputType);

Which would, IF it exists, allow setting “self” to aInput’s value and discarding the original aInput.

It would be called for:
var x := somethingThatReturns;

And an operator to match it:
class operator Move(var aDest: InputType; var aSource: InputType);
Which would be called for:
x := somethingThatReturns;

I suppose.
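The two cases sketched above, initialising a new variable versus assigning to an existing one, map onto the C++ move constructor and move assignment operator. The sketch below is a C++ analogy with a made-up `Buffer` type. It moves from a named object via `std::move` so that both calls are observable, because a C++17 compiler constructs a returned temporary directly in the new variable and the move constructor would not run at all.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource-owning type that remembers which operation filled it.
struct Buffer {
    std::vector<int> data;
    const char* last_op = "constructed";

    explicit Buffer(std::vector<int> d = {}) : data(std::move(d)) {}

    // Move constructor: *this is uninitialised, so just adopt the storage.
    Buffer(Buffer&& src) noexcept
        : data(std::move(src.data)), last_op("move-ctor") {}

    // Move assignment: *this already owns storage, which gets replaced.
    Buffer& operator=(Buffer&& src) noexcept {
        data = std::move(src.data);
        last_op = "move-assign";
        return *this;
    }
};

std::string op_for_initialisation() {
    Buffer tmp(std::vector<int>{1, 2, 3});
    Buffer x = std::move(tmp);  // comparable to: var x := somethingThatReturns;
    assert(x.data.size() == 3);
    return x.last_op;
}

std::string op_for_assignment() {
    Buffer tmp(std::vector<int>{1, 2, 3});
    Buffer x;
    x = std::move(tmp);         // comparable to: x := somethingThatReturns;
    assert(x.data.size() == 3);
    return x.last_op;
}
```

The split mirrors the copy constructor and assign operator pair: the constructor form can assume an empty destination, while the operator form has to release whatever the destination held before.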

LazarusRisen · December 30, 2020, 8:01am

You nailed it; this would do the entire move trickery.

The only thing you might want to consider is what I mentioned before:

We have memmove, and then Move(…) in C#, but none of those do what your operator would do, so I would strongly suggest not calling it Move(), since I fear people would still subconsciously think it does roughly the same as memmove. Something like “MoveRef” or “MoveAssign” would be better. But anyway, however you decide, it’s good that you’re considering it, and thanks a lot.

RemObjectsSoftware · December 30, 2020, 8:03am

Thanks, logged as bugs://85361 (move semantics)