Skip to main content
Topic: Infinite loop (Read 4916 times) previous topic - next topic

Infinite loop

This seems to occur when trying to get the "Load game" window, if you have two paks, one of which is a prefix of the other
(For instance, pak.britain.ex and pak.britain.experimental), and you're running the one with the longer name (pak.britain.experimental), in Bernd's sorting routines.

I haven't quite been able to track it down, but I'm guessing that an inappropriate value is being returned from the comparison operator to the sort algorithm (e.g. returning 'identical' for two different strings).

Re: Infinite loop

Reply #1
Aaagh, left out important information: I believe this is in Bernd's new work, this is not in 7.3

Re: Infinite loop

Reply #2
I can now generate a crash by having old save files with pak name "Mental" and a new pak named "Mental-3".

There's clearly some serious problem in handling names where one is the prefix of another.  If no pak name is the prefix of another, everything works.

Re: Infinite loop

Reply #3
Very strange. If've got paks "pak", "pak128", "pak.german", "Pak128.Britain-Ex", "Pak128-Britain-Ex0_4", "pak128-nightly", and some more, but no problems.

I'll check with less than 3 paks...

But it works. Both Visual C++ and gcc version...

Could you please post a backtrace.
The journey is the reward!

Re: Infinite loop

Reply #4
Very strange. If've got paks "pak", "pak128", "pak.german", "Pak128.Britain-Ex", "Pak128-Britain-Ex0_4", "pak128-nightly", and some more, but no problems.
Do you have saved games from all of them?  It seems to relate to the code for sorting the saved games in the load/save window.  It could also relate to implementation-dependent behavior of strcmp, or something like that.  EDIT: It could also relate to having saved games for a deleted pak?

Backtrace for the second problem (the infinite loop hasn't been easy to reproduce lately, so might be fixed):
Code: [Select]
Program received signal SIGSEGV, Segmentation fault.
0x08141ede in gui_file_table_pak_column_t::get_text (this=0xd6527a0, row=...)
    at gui/loadsave_frame.cc:185
185 return strlen(pak) > 3 && (!STRNICMP(pak, "zip", 3) || !STRNICMP(pak, "xml", 3)) ? pak + 3 : pak;

#0  0x08141ede in gui_file_table_pak_column_t::get_text (this=0xd6527a0,
    row=...) at gui/loadsave_frame.cc:185
#1  0x081434f7 in gui_file_table_pak_column_t::compare_rows (this=0xd6527a0,
    row1=..., row2=...) at gui/loadsave_frame.cc:169
#2  0x080f858c in gui_table_row_list_t::compare_items (this=0xd6524a0,
    item1=0x69, item2=0xd653348) at gui/components/gui_table.cc:315
#3  0x080f8e92 in list_tpl<gui_table_row_t>::qsort (this=0xd6524a0, l=0, r=12)
    at gui/components/../../tpl/list_tpl.h:388
#4  0x080f8fa5 in list_tpl<gui_table_row_t>::qsort (this=0xd6524a0, l=0, r=16)
    at gui/components/../../tpl/list_tpl.h:398
#5  0x080f9025 in list_tpl<gui_table_row_t>::sort (this=0xd652380)
    at gui/components/../../tpl/list_tpl.h:468
#6  gui_table_t::sort_rows (this=0xd652380) at gui/components/gui_table.cc:263
#7  0x0815e82f in savegame_frame_t::fill_list (this=0xd651e60)
    at gui/savegame_frame.cc:191
#8  0x0815e958 in savegame_frame_t::infowin_event (this=0xd651e60,
    ev=0xbfffd484) at gui/savegame_frame.cc:531
#9  0x08244ca3 in create_win (gui=0xd651e60, wt=1 '\001', magic=11)
    at simwin.cc:448
#10 create_win (gui=0xd651e60, wt=1 '\001', magic=11) at simwin.cc:384
#11 0x08216bfc in simu_main (argc=1, argv=0xbffff434) at simmain.cc:1049
#12 0x082a43ca in main (argc=1, argv=0xbffff434) at simsys_s.cc:743

Re: Infinite loop

Reply #5
Do you have saved games from all of them?  [...] EDIT: It could also relate to having saved games for a deleted pak?
I took these names from the Load dialog. Not all names refer to an installed package.

In the call stack there is an invalid reference to item1 (0x69) at level #2. Thus the seg fault is not originated in gui_file_table_pak_column_t::get_text(), but in or prior to list_tpl<gui_table_row_t>::qsort(). I guess it is in list_tpl<gui_table_row_t>::qsort(), which would fit best to the infinite loop as well.

I wish I had your savegames...

You've got 17 savegames. How many "Mental" and how many "Mental-3" savegames?

For a reproduction, it would be helpful to now the order in which they are loaded. Could you please compile a Simutrans Experimental without sorting (remove the call to sort_rows() at gui/savegame_frame.cc:191). If it still crashes, qsort() is absolved, otherwise please post the resulting dialog. Thank you.
The journey is the reward!

Re: Infinite loop

Reply #6
Yes, it doesn't crash (there!) with the sort code turned off.

Here's the screenshot of the load window (zipped):
http://files.[ simutrans [dot] us (site down, do not visit) ]/files/get/HlGSC3WMe7/simscr13.zip

 

Re: Infinite loop

Reply #7
Unfortunatelly I still cannot reproduce the bug. I added some debug output, which should help to find the error.
Would you please merge from my master and execute SE with -log -debug 4 and post the qsort relevant lines of the resulting log?
Thank you.
The journey is the reward!

Re: Infinite loop

Reply #8
Code: [Select]
Debug: 20gui_table_row_list_t.qsort(0, 18):	enter
Debug: 20gui_table_row_list_t.qsort(0, 18): i = 0, pivot = data[p = 9], j = 18
Debug: gui_file_table_pak_column_t::compare_rows(): "3Mental" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[i = 0] >= pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "3Mental" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 18] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "ModernMental" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 17] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "3Mental" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 16] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "Mental" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 15] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "Mental" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 14] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "Mental" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 13] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "Mental" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 12] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "Mental" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 11] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "Mental" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 10] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "Mental" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 9] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "Pak128.Britain-Ex" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 8] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "Pak128.Britain-Ex" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 7] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "Mental" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 6] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "Pak128.Britain-Ex" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 5] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "Pak128.Britain-Ex" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 4] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "Pak128.Britain-Ex" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 3] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "Pak128.Britain-Ex" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 2] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "Pak128.Britain-Ex" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 1] > pivot
Debug: gui_file_table_pak_column_t::compare_rows(): "3Mental" > "Mental"
Debug: 20gui_table_row_list_t.qsort(0, 18): data[j = 0] > pivot
Segmentation fault

The segfault is because you aren't checking your bounds before array indexes.  *Always* check your bounds before array indexes.  Patch for bounds checking attached. :police:  This returns us to the infinite loop.

Obviously, there's a second problem here -- "Mental" > "Mental"?  I don't think so.  And now that I know that's the problem, I think I know what the error is:
Code: [Select]
        char s1[1024];
        strcpy(s1, get_text(row1));
        float f1 = strsim(s1, pak);
        char s2[1024];
        strcpy(s2, get_text(row2));
        float f2 = strsim(s2, pak);
        int result = sgn(f1 - f2);

You're using "sgn()" on floating-point numbers.  You will practically always get +1 or -1.  f1 - f2 may give a *signed zero* on a conformant floating point installation, and will likely give non-zero results due to roundoff error on any implementation.  Comparison of floating point numbers (-, >, <) is dangerous if you expect to ever deal with two equal numbers.   :police:

Rewrite strsim to use integers and this bug will probably go away.

EDITED: Verified.  If I add a debug line to print out the comparison between the floats, I get this:

Debug: gui_file_table_pak_column_t::compare_rows() First P****:   "0.142857" > "0.142857"

;)

Floating point numbers cannot be used to generate a canonical comparison order.  Even if you put a check for equality of strings up at the top of the routine, you'd still have to deal with the case of two different strings generating the same floating point number -- then the comparison of the "equal" floating point numbers could go different ways each time you do it.

FURTHER EDIT:

You can actually use floats if you want to.

The "floating point way" to test for equality is this:\
Code: [Select]
float diff = f1 -f2;
if ((diff > -.0001) && (diff < .0001) {
  //....they're equal enough.....
  return 0;
} else {
  return sgn(diff);
}
Exact equality practically never occurs in floating point arithmetic, so you *always* have to do something like this.

Re: Infinite loop

Reply #9
Just for curiosity: Why not using vector template anyhow? That would always check boundaries ...

Re: Infinite loop

Reply #10
@neroden
Thanks Nathanael!
I changed strsim() to return an sint32 to eliminate the risky float comparison.
Could you please test once more before I reduce the extensive debug output.

@prissi
list_tpl.get() and list_tpl.operator [] check boundaries too, but I didn't use them in qsort().
And I didn't use vector_tpl, because I needed more feastures than it provided and i didn't want to modify it for safety/stability reasons.
Actually I started with it before I decided to write another template. List_tpl is built for pointers to (polymorphic) objects.
The journey is the reward!

Re: Infinite loop

Reply #11
Thank you all for your help in fixing this!
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

Re: Infinite loop

Reply #12
@neroden
Thanks Nathanael!
I changed strsim() to return an sint32 to eliminate the risky float comparison.
Could you please test once more before I reduce the extensive debug output.
Works now!  Though I had to add some header magic to compile since MAXINT isn't defined by default on Linux (the change is on my jp-devel branch).