Skip to main content
Topic: moving utils/cstring_t to std:string C++ (Read 9817 times) previous topic - next topic

moving utils/cstring_t to std:string C++

Hi,

I started looking at the simutrans code, and decided to clean up a bit.  At first I was too ambitious, and I tried to do too many things at once, but I'm able to learn.  ;)
My long time goal would be to get rid off most of the code in the utils/* directory.  For now I have looked at the cstring_t-cl****.  I have converted it into a wrapper-cl**** for std::string.  I hope you are interested in these changes.  I would then gradually move from cstring_t completely to std::string.

If you are interested, please double check the code before applying it, I think I ported everything correctly, but one never knows... there was a single thing I didn't really understand:  The method "void set_at(int idx, char) const;" is declared "const", but it does change the string (!?), and for some funny reason the compiler doesn't complain...

Looking forward to any comments.  Thanks

Klaus

Re: moving utils/cstring_t to std:string C++

Reply #1
The method "void set_at(int idx, char) const;" is declared "const", but it does change the string (!?), and for some funny reason the compiler doesn't complain...
The cstring_t contains only a pointer, this pointer is not modified, so 'const' is correct.
Parsley, sage, rosemary, and maggikraut.

Re: moving utils/cstring_t to std:string C++

Reply #2
The cstring_t contains only a pointer, this pointer is not modified, so 'const' is correct.

I thought the C++ standard didn't allow any member variables to change (and not even the memory pointed to by a constant pointer).  I see (of course in my opinion this means that the  standard is messed up. ;) ).

Klaus

Re: moving utils/cstring_t to std:string C++

Reply #3
There is a distinct difference between a pointer to a constant and a constant pointer.
const int* a = &some_integer;
int* const b = &some_integer;

*a = 12;  // Compiler should complain.
*b = 8;  // No problem here.

a = &other_integer;  // No problem here.
b = &other_integer;  // Compiler should complain.


See Stroustrup's Technical FAQ for details:   www.research.att.com/~bs/FAQ2.html
I think that's the correct URL, otherwise remove the webpage suffix and look for the Technical FAQ yourself.

Edit:  The URL is http://www2.research.att.com/~bs/bs_faq2.html#constplacement.

Re: moving utils/cstring_t to std:string C++

Reply #4
There is a distinct difference between a pointer to a constant and a constant pointer.

a = &other_integer;  // No problem here.
b = &other_integer;  // Compiler should complain.


I understand, and I agree now:  A 'const' declaration for a method declares that the cl**** (i.e. all of its member variables) will be constant for this function call.  Formally this is (now) very clear.

Intuitively though I considered the memory pointed to by a member variable of a cl**** to be part of that cl****.  So that if a method doesn't change a cl****, it shouldn't change the memory pointed to by this cl**** either...

For example here in simutrans you had a cl****

cl**** cstring {char *_str; };

Intuitively, I feel that implementing _str as a pointer, is really just a work around that should mean that the string pointed to by _str is part of cstring, and if you declare a "const cstring A;"  you should not be able to change content of _str either.  Of course, formally this is not true.

Thanks for the explanations and clarifications.

BTW Did anybody try the changes I proposed?  This weekend, I have played simutrans  with my patch  for several hours, and I didn't detect any problems.  I don't know if all platforms on which simutrans runs support the std-C++ library, but I think that it would clean up the code, if we ported simutrans gradually from cstring_t to std::string, from vector_tpl to std::vector etc...

Cheers Klaus

Re: moving utils/cstring_t to std:string C++

Reply #5
First:
Simutrans code is older than std ... thus is was not used when the programming started.

cstring_t could certainly go, but then really go and not via a wrapper. Making a wrapper for an existing working cl**** is imho not useful. Rather then the code calling cstring_t needs to go (or even converted to other stuff, if it never changes in const char * or cbuffer_t if it length should be limited etc.)

About cbuffer_t I am not so sure, since this is with a constant length and enforcing it. If there is a standard type, it can go. Not sure if there is something to win.

And simutrans needs to compile on any plattform. Also on Haiku/BeOS which is still GCC 2.95.xxx resp. the BeOS CC, which has only the part of std, that was available 1997 ...

Re: moving utils/cstring_t to std:string C++

Reply #6
cstring_t could certainly go, but then really go and not via a wrapper. Making a wrapper for an existing working cl**** is imho not useful. Rather then the code calling cstring_t needs to go (or even converted to other stuff, if it never changes in const char * or cbuffer_t if it length should be limited etc.)

I completely share this opinion.  The conversion to a wrapper cl**** was only meant as an intermediate step to really understand what was going on in each method before starting to change dozens of files that use cstring_t...  I believe that this reduces the probably of introducing mistakes...

Quote
About cbuffer_t I am not so sure, since this is with a constant length and enforcing it. If there is a standard type, it can go. Not sure if there is something to win.

I didn't look at that yet, but I think for example that the vector cl**** template can be removed.

Quote
And simutrans needs to compile on any plattform. Also on Haiku/BeOS which is still GCC 2.95.xxx resp. the BeOS CC, which has only the part of std, that was available 1997 ...

I have have no way of testing whether any new code compiles on BeOS.  :-(

Klaus


Re: moving utils/cstring_t to std:string C++

Reply #7
If std contains a weighted vector, then the vector cl**** can go. Same in principle for the list cl****; but simutrans uses its own much faster memory managements, thus the list cl**** should use this then.

Re: moving utils/cstring_t to std:string C++

Reply #8
prissi, the closest std container to a weighted vector would be the std::priority_queue.  The problem, if I recall correctly, is that access is only allowed at the ends and not within the middle.  This does not change even if the underlaying datatype, incidentally a deque by default, is specified otherwise in the template parameters.  Although deriving from std::vector and overloading the insertion/deletion operations to allow a priority comparison may be a solution.  Also consider that the elements within std containers have copy semantics, and not move semantics.

Re: moving utils/cstring_t to std:string C++

Reply #9
Well the weighted_vector_tpl with middle access is in the most performance critical parts: P****enger generation and distribution. Changing this is a bad idea. I would then just keep the vector_tpl. You can use sdt:: like iterator anyway with it.

Re: moving utils/cstring_t to std:string C++

Reply #10
prissi:  If it ain't broke, don't fix it.  The only exception is perhaps to update vector_tpl to be allocated more like std::vector (perhaps) and (definitely) to support C++ exception handling.  Of course, two debates come from this:  Whether to support exception handling and whether the modification is too resource (time, space) demanding. 

I think I recall reading in Stroustrup's FAQ that adding exception handling to a program is not expensive, only increasing the size of the executable by 3% and perhaps even being slightly faster in execution time (Saves time by not constantly checking for error values).  It was also frequently mentioned to replace the use of a std container if there is a particular need to be satisfied, such as the middle random access that you have stated.

Re: moving utils/cstring_t to std:string C++

Reply #11
Instead of adding exception handling, I would like to have an automatic generation of backtraces. This would help debugging the fatal errors for out-of-bound index.
Parsley, sage, rosemary, and maggikraut.

Re: moving utils/cstring_t to std:string C++

Reply #12
Dwachs:  Could be done with exception handling:  By catching the error, posting to a log the particular function that the catch block is for, and then rethrowing the error. 
This is not saying that it could be done without exception handling, but it would be a lot easier with the handling.  It might be interesting to note that std::vector supports the at() member function for random access with bounds checking, and the typical array-like [] syntax is for unchecked access.  If the programmer knows that the index is within bounds, then the [] access does not incur a penalty.  Perhaps adding this checked access to certain STsource/utils containers would be worthwhile?

I'm pondering splitting the heightmap loading into two functions, to be a nested calling situation, and also using the std::ifstream instead of the old File datatype.  This way, if an error occurs, there might be a chance of recovering from the point of the error in the data stream, instead of an "all or nothing" approach.

kniederk:  Congratulations upon attempting the general overhaul to std::string.  Only in few instances can someone have too much encouragement!   ;D

Re: moving utils/cstring_t to std:string C++

Reply #13
Using -debug 4 as debug level, a division by zero is created in case of fatal errors, which produces a backtrace. I am using this technique since years, (especially since on Win98 gdb did not work and I had to rely on dbg->message and backtraces) and it was already part of the original code from Hajo.

Execption handling is very bad performancewise. Furthermore, any out-of-bound is fatal: This is a program error and needs fixing.

[] versus at(): If the optimiser is clever enough, the range checking will be done only once before an iteration. templates are like implicit inline code, sicne they are only in header files. And for a single access, range check is not much overhead. (Java does range checking all the time, od?) Other than that, better save then sorry: Usually programmer ****ume that they do not err and would always use [].

std::ifstream does not work properly on the BeOS compiler last time I checked; furthermore, the only way to recover from broken files would be for plain ASCII XML-files. Any of the other compressed formats will be garbage if a bit flips. (But I am not sure, I got your argument right.)

Re: moving utils/cstring_t to std:string C++

Reply #14
Hi,

Now instead of using a wrapper with std::string as a member, cstring_t inherits directly from std::string.

Still the (not so) long-term goal is to get rid of cstring_t completely, but I just didn't want to risk changing dozens of files at once.

So if you could give it a try or even compare the code, and tell me if everything works as desired, I would be most delighted. ;)
And if everything is fine, I wouldn't mind if you already checked it in into the svn-server,  so that I would know to which point I could come back, if future changes do not work.

Thanks

Klaus

Re: moving utils/cstring_t to std:string C++

Reply #15
I would rather suggest to change the code directly. cstring_t is anyway only used in very few places.

Re: moving utils/cstring_t to std:string C++

Reply #16
Hi Prissi,

I know that I will get on your nerves with this.  I'm almost there, cstring_t can be almost removed if it were not for the cast-operator to 'const char *' which does not exist for std::string.  Otherwise all methods of cstring_t are equivalent to the ones of string.
I started getting rid of the cast-operator too, but the first time I messed up, because this is used implicitly in _many_ files.

So I would _really_ like the patch included here to be committed to the SVN-repository, and then I can proceed more carefully.  And I have a question about cbuffer_t.  What is its purpose, and why can't it just be exchanged by cstring_t or std::string?

Cheers

Klaus

 

Re: moving utils/cstring_t to std:string C++

Reply #17
cbuffer_t uses a fixed preallocated buffer. Especially when building a dialoge at every display steps, lots of copy actions during appending of stuff (like for p****enger destinations) is quite time consimung. Simutrans uses just a constant length buffer, which avoid lots of those.

But as I said, I want to avoid to add a wrapper cl**** for something not broken at all. This just combines the disadvantages of an own cl**** (which can inline code) and a library (which I need to trust and most often cannot inlined).

Furthermore, I am not very please by braindead naming. npos is giving then length? Why on earth this isn't a proper function in std:string and why is it named npos? According the the reference, shouldn't length() return this information? (Is npos public at all on standard implementations?) At least this say c++.com:
ttp://www.cplusplus.com/reference/string/string/

Re: moving utils/cstring_t to std:string C++

Reply #18
I could understand that you would not trust my programming skills ;-), but not trusting the stdlib???

It's not that I want to keep the wrapper cl**** for more than one or two weeks anyway, it's more that I fear that I will produce a very large patch changing many files, and things may happen (because right now a lot of casting is being done implicitly, and this has all to be made explicit now).  If a bug were to pop up (which won't happen because I'm very disciplined haha), there will be a huge patch, and tracking the bug down will be more work, but you are the boss. ;-)

Klaus

Re: moving utils/cstring_t to std:string C++

Reply #19
Well, you are already changing almost all needed functions, so there is very little change to make the few cast just explicit c_str. Especially, since those casts were mostly anyway explicit cast with (const char *) for printf etc.

The SVN should be more for rather finished stuff. If there are errors, usually they are spotted quite fast, nowadays. There are luckily many more than just my poor eyes to look at the code.

Not trusting std: On BeOS the c-lib is older than the the std implementation, and it is not a Unix OS! Not everything is GNU/Windows ... I still have to see what subst is implemented there (although I think it is complete, since it has a similar cl**** in its OS before.) I meant it this way: One can heve an error in the wrapper, and you loose inlining, that is why I do not like wrappers.


Re: moving utils/cstring_t to std:string C++

Reply #20
Here is a new patch.  cstring_t.cc is empty and can be deleted.  cstring_t.h still contains a few macros that are used at other places, and I didn't know where to put them, otherwise it can also still be deleted.

I compiled and played the game a bit, and everything seems to work (on Linux).  I don't know about besch/writer/* though.  These files are not compiled, and I don't know what their use is.

Quote
Well, you are already changing almost all needed functions, so there is very little change to make the few cast just explicit c_str. Especially, since those casts were mostly anyway explicit cast with (const char *) for printf etc.

There were many implicit casts everywhere, but I believe that in a second step of purification, many function calls can be modified now.  I find the worst so far is the loadsave.h which I would like to rewrite partially.

Test this patch as much as possible, before checking it in, since it is quite large.

Cheers

Klaus

Re: moving utils/cstring_t to std:string C++

Reply #21
Hi Klaus,

I don't know about besch/writer/* though.  These files are not compiled, and I don't know what their use is.

Those files are for compiling pak files for the paksets. If you look into the trunk/makeobj folder, you should see another makefile for makeobj. Hope this helps!

Knightly

Re: moving utils/cstring_t to std:string C++

Reply #22
Furthermore, I am not very please by braindead naming. npos is giving then length? Why on earth this isn't a proper function in std:string and why is it named npos? According the the reference, shouldn't length() return this information? (Is npos public at all on standard implementations?) At least this say c++.com:
http://www.cplusplus.com/reference/string/string/
prissi:  Both std::string member functions length() and size() follow the intuitive result of returning the length of the specific std::string being used; if the string has zero length then a value of zero is returned, and similar for any other amount of characters for the particular string.  Both length() and size() are interchangeable, with size() being consistent with std containers in general, but length() is typically preferred for strings.
You seem to be misunderstanding the reason for the constant value string::npos, which is used for searching within a string.  It did not make sense for the return value to be zero if the string did not contain the requested substring, so another value was required.  By placing std::string::npos to be the largest possible size for a string and nearly impracticable for any sensible useage, it relieves the programmer from having to double check a particular, possibly correct, search position.  The naming of npos was most likely due to the typical enormous committee size notion of trying to get everyone in agreement; someone must have been adamantly opposed to the more common-sense name for it.
You might consider checking out the find(), find_first(), find_last(), find_not_of() and similar search functions, which are quite nice to have available.

Re: moving utils/cstring_t to std:string C++

Reply #23
Yeah, I found my mistakes already yesterday. You know, I learned C/C++ before std:: was there (even before the stream libraries) thus I apologize for misunderstanding. Of course the naming of std is out of your control; they have lost of other not very intuitive names; but then it is standard and everybody is supposed to know them nowadays (but me ... )

Re: moving utils/cstring_t to std:string C++

Reply #24
Hi!

I just skimmed over this diff and I noticed a few places where a tab was replaced with some number of spaces (for example tab,tab,tab,space,space) causing broken indentation.

In the diff around lines:
~880
~1340..1392
~3192

You should really use visible whitespace in your editor, or some other method of checking for these.

Re: moving utils/cstring_t to std:string C++

Reply #25
Ok, another version.  I type 'make' in the makeobj-directory, and everything compiles.  Still I don't know if everything works, but if I understood correctly, the purpose is to produce some pak-files, so maybe somebody could just use my version and diff the files produced by it, with the files produced by the original one.

If somebody wants to help out with the indentation, I would be very happy, I don't know how to do these things right.

Klaus

Re: moving utils/cstring_t to std:string C++

Reply #26
diff -bB ignores whitspace changes. Not sure how to handle it within svn though.

Re: moving utils/cstring_t to std:string C++

Reply #27
svn diff -x -bB
Parsley, sage, rosemary, and maggikraut.

Re: moving utils/cstring_t to std:string C++

Reply #28
This command doesn't work on my computer.  I invite somebody here to fix the indentation problems for my patch, and also to decide where to put the macros that are still left in cstring_t.h.

Thanks

Klaus

Re: moving utils/cstring_t to std:string C++

Reply #29
There was some code missing for searchfolder, otherwise it is fine. I will think about the macros.