I've got a game going on a 1024x1024 map with several large towns and 8 fast trains, (4 Northbound 4 Southbound) Multiple trams running in all full growth towns (15000 or over). The game has become extremely laggy with no seeming cause. It can happen at the start of a month, two days in or halfway. All not needed processes are shut down and the graphics card is running at 'Turbo'.
The same game played in Experimental suffers no lagginess at all.
Maybe a look should be taken at Knightlys 'New Path Search' to see if it can be incorporated into Standard? Or maybe a new type of search path could be written to suit Standard?
I suspect this is not a pak64-specific issue...
Meanwhile, have you compared the performance of the GDI executable vs. SDL?
Yes I have, It doesn't make any noticable difference. I haven't tried it with PAK128 as I only have PAK128 Britain which is a EXP game. I would try PAK128 but it takes weeks to get to the stage where I have PAK64. But you could be right it may be a Simutrans-Standard problem.
Can you post your savegame?
Ok posted to http://simutrans-germany.com/files As Colin's Standard 2611.sve
(Moving to Development...)
Direct link to the savegame is http://simutrans-germany.com/files/upload/Colin%27s%20Standard%202611.sve (http://simutrans-germany.com/files/upload/Colin%27s%20Standard%202611.sve); (EDIT:) it is loadable with plain pak64/r175.
I have just tried Colin's save game.
There was a brief period of reduced responsiveness immediately after loading, but soon afterwards the game ran smoothly. When I tried to add a halt or adjust certain convoy's schedule, responsiveness was reduced again shortly afterwards for quite some time.
So, probably the reduced responsiveness is caused by the re-routing of all existing ware packets in one single world step after set_schedule_counter() is triggered by actions like halt construction/deletion or schedule changes.
I've noticed the same, while trying Colin's game. So it's the rerouting of goods, which makes the game laggy...
Reduced by quite a lot and when building a game, because one is continually adding halts or new schedules, it happens with repeated monotony. Every few game days actually, and not counting Auto Save which I've extended to every 3 months. Brave little Aussie fellow that I am. ;D ;D ;D
Gerw, is it possible to spread the load of re-routing ware packets across a few steps?
Colin, I understand how poor the responsiveness would be when you continue to add/remove halts or adjust schedules, because I have a similar PC specifications as yours ;) The game almost freezes during such global re-routing.
It's currently spread across 32 steps, see spieler_t::step. But maybe this number should depend on the time, the rerouting needs.
Thanks a lot for pointing out. :) It is quite incredible that after spreading re-routing across 32 steps, the game is still so unresponsive.
But one can think of a worst case, where all the 'expensive' stations (with many waiting goods) are rerouted in one step...
At least, there wasn't this problem in 99.17.1.
But 100.0 and after versions have this problem.
I heard many people saying 100.0 and after versions are too heavy.
My observations with Colin's savegame:
1) After loading of the savegame, and also after deployment of a new schedule or a new goods category for an existing schedule, ST becomes unresponsive for some seconds, with down to 2(!) fps and 100% cpu/core utilization. => ST seems to recalculate routing for all goods packets, not just for the goods categories served by the new schedule/convoy. This should be fixable by filtering before re-routing.
2) The previous observation especially applies to the change of routing in an isolated network, which seems to still lead to re-calculation of routing on the whole map. For isolated freight lines/networks, it may be useful to limit the refresh, e.g. by walking through the graph of stations *and* connected convoys, also filtering(!) by goods category (see above), marking the stations (also convoys due to their contents?) for refresh, and refreshing only the marked ones in the next loop (removing the marker here again).
1)+2) could be changed as described (or by other means), but a large, fully interconnected P+M network will still not benefit directly from such changes, only indirectly by optimized routing for certain other cases. Further spreading of re-routing calculation may help.
3) side-note: station coverage setting in the savegame is 3 (=> 49 tiles to check at P+M+G generation), which doesn't matter for re-routing, but the additional calculation effort for newly generated p****engers+mail increases overall load.
I have just read whoami's post and it seems to me that he has hit the nail on the head, as they say. If his suggestions can be implemented we should see a, not vast, but quite substantial in crease in game flow. (Incidentally my FPS went down to 1)
I could set the station coverage back to 2 but surely that would lead to an increase in Trams/Tram lines/Stations, creating even more route finding?
I have question re this particular game.
modnote by whoami: this part is now discussed in http://forum.simutrans.com/index.php?topic=3203.msg31352#msg31352 (http://forum.simutrans.com/index.php?topic=3203.msg31352#msg31352):
I have a p****enger train (Not shown in the uploaded game) Which travels between two stations only. It leaves one station with over 3000 pasengers and arrives at it's destination where they all disembark. It makes a loss of over 41 thousand (not in travel costs) when the p****engers leave. It returns home with over 300 p****engers and makes a profit of over 22k when the p****engers leave. Have I got a setting in wrong in the config.tab or is this another bug?
As I wrote, my suggestions (if they matter at all) will not help (much) in the case of the p****enger/mail network.
Directly, it will only increase routing calculation effort to a smaller extent, because of the larger number of stations ("hops"). But usually, you will have more different lines, more interconnection points and more transfers, and that makes a big difference.
(I ****ume that you don't refer to ST-experimental.)
pay_for_total_distance = 2 can lead to such a result.
Also, cost deduction at switching of the month might be involved.
Do you mean that if I make multiple smaller Tram networks in my cities instead of one large network it would improve the flow of the game?
In Experimental the train made $721 profit on the outward journey with 3078 p****engers as apposed to a $41k loss in Standard.
Additional info re-train behaviour.
Train travels A-B-A.
From A to B
journey 1. 3078 p****engers, loss on arrival at B -$41k plus.
Journey 2. 3078 p****engers, loss on arrival at B -$42k plus.
Journey 3. 3078 p****engers, loss on arrival at B -$43k plus.
From B to A
journey 1. 325 P****engers, profit on arrival at A $22k plus.
Journey 2. 476 P****engers, profit on arrival at A $16k plus.
Journey 3. 124 p****engers, profit on arrival at A $4k plus.
This doesn't make sense and it doesn't matter if I set the pay_for_total_distance to 0, 1, or 2. I realise that these settings may not affect a current SVE game. But it doesn't detract from the nonsensical profit and loss that is being generated.
More different lines means more schedules and transfers to check, so goods routing calculation will need more CPU time. From this aspect, a larger station coverage can support in reducing calculation effort.
The performance-wise effect of the station coverage setting depends, however, also on your network layout: The more tiles with houses are covered by several stations, the slower the game may become, because a generated p****enger will check more than one station for suitable connections. But the developers can provide a better, substantiated analysis on this complex problem.
As your savegame doesn't contain an extremely large network, there may be a performance bug. Additionally, Z9999 may be right when he said that ST has become "too heavy".
Regarding the obscure revenue: It's unrelated to the program performance problems, so could you please open a new topic for that, also providing the savegame with the train? After that, I will move my own replies and any remaining related text there.
Oh dear! I thought my SVE contained a fairly m****ive network. I must be really naive.
Ok I'll upload my SVE with the latest additions and then open a new topic.
The station coverage is not important.
The problem in your savegame are not the size of the map but the shear amount of waiting persons. You have several stations with more than 300000 people waiting. Updating such a monster will of course take some time.
I will look into the code to make it smoother, although no promises ...
Thank You Prissi, I will look forward to any improvements you can make.
In my next game (which may be awhile away due to this one being nowhere near completed) I will reduce the P****enger Level.
@ Prissi,
Does nightly 2640 contain your changes?
Hi Colin :)
According to the source code update history, Prissi's changes for speeding up re-routing have already been incorporated. So, does it improve the responsiveness of your game?
Knightly
No, if anything it's getting worse. I'm sorry to say.
Same here.
What we want is not to solve jumping convoi, but to solve unresponsive.
In recent simutrans, idle time is zero during long time even though convoi keep moving well.
After looking at Prissi's changes, I think one of the reasons why the unresponsiveness problem has aggravated lies in the fact that >= 8192 ware (not ware packets) are re-routed in each step. For hub stations with large-sized ware packets (i.e. large
menge), this will indeed help to prevent too much re-routing work in one step; but with non-hub halts, they usually have only small-sized ware packets (please see the attached image), so it needs to process many non-hub halts before the limit of 8192 can be reached.
IMHO, re-routing workload depends not on packet size (menge), but on packet count. More packets, more re-routing work, and vice versa. So, I would suggest to change from
counting ware menge to
counting the number of packets processed. Of course, the limit of 8192 also needs to be reduced accordingly.
The problem in Colin save are that there are several stations with 300000 p****engers. Updating (rerouting) them takes nearly a second on my computer. During this no user interaction is possible, since such can be only done outside a step. The only other way would be that the stations themselves only handle 8192 each step, and indicate the result then. But this would require more complex logik, as changes of the p****enger numbers must be also taken into account (i.e. something like a dirty flag for ware in stations.) I will look into this next week.
Edit : Post removed as the provided screenshot was wrong.
Knightly,
I really only stumbled accross this thread randomly, but looking at your screenshot, I strongly suspect that you fell victim to a display bug: the sum of p****engers of the packages displayed is never 300k, not even close. I guess lots of destinations with letter >B are missing.
Display works only with integer coordintes with 16bit; therefore I think the scrollbar just messed up. It is very easy to time those stations and they have like 4000-30000 packets and cn take up to 800ms or rerouting (on this computer here).
Prissi,
Really sorry! I didn't realise that the list is incomplete. I loaded Colin's save game again and looked at the same station, and the list kept growing gradually when I tried to scroll down.
Nevertheless, I think it is more logical to count the number of packets instead of the sum of menge. Hope you can consider this suggestion when you look at the code later.
Knightly
[offtopic]
this:
should be fixed in revision 2642.
[/offtopic]
In the current implementation, if the goods of a station are rerouted, simutrans does a BFS for each packet. Since no heuristic is used, it is possible to do this in only one BFS, until the target for each packet is found.
I will try to implement this, if there is any free time next week...
Please check again with curretn nightly. Should be much smoother (only lagging on my eeePC).
bool haltestelle_t::step()
(...)
else if(reroute_counter!=welt->get_schedule_counter()) {
// all new connection updated => recalc routes
if( !reroute_goods() ) {
and
bool haltestelle_t::reroute_goods()
{
// reroute only on demand
reroute_counter = welt->get_schedule_counter();
(...)
// break after a certain number of reroute actions
if( ++packets==255 ) {
return false;
}
Then, after breaking, reroute_goods() will not be called again, isn't it ?
Thank you for spotting this out. Indeed, the update needs to be moved to the end ...
I tested with Colin's savegame.
For 1184 Halts.
rebuild_destinations() used 106 steps and reroute_goods() used 210 steps.
Sum of them is 316 steps.
[EDIT]
Another game: 3 players and public halts
player 1: 36 halts used 12 steps
public halt: 945 halts used 97 steps
player 2: 122 halts used 8 steps
player 3: 90halts used 24 steps
sums: 1193 halts
Numbers of halts per step is.
older version 1193/32steps=37.28 halts/step
this version 36/12+945/97+122/8+90/24=31.74 halts/step
Not so bad.
I didn't tested it, but I think, the changes contains a bug: last_ware_index isn't reseted in the loop in simhalt.cc:781.
However, it would be faster, if the BFS is done for all goods in one step instead of distributing it to several steps... And 316 steps are roughly 60 seconds...
Doing in one step caused the lag that we are trying to get rid of ... and you are of course right about the last step.
But if you do all searches within one BFS run, it will take only a little bit more time than the BFS run for the packet with the most transfers. So it won't be more laggy than the current nightly and the rerouting would be done much faster.
Not sure, what "BSF" is; but rerouting all goods of one of the large stations took more than 100ms, with no user interaction (and thus no mouse movement) is possible. This is recieved as lagginess, which must be avoided. It severance of course depends much on the computer used.
I'd ****ume it means "breadth first search", and he wrote BFS which matches quite nicely on that.
Hi Guys,
I've just downloaded r2650 and I'm pleased to inform you that the game has improved at least 90%. There is still a bit of random jerkiness and it's reasonably difficult to drag a menu screen across the map, Apart from that. :award: :award: :award:
Thanks Gwer, and many thanks to Knightly who has tried his hardest to guide this addle brained gamer through my trials and tribulations.
It's a pity James doesn't listen to you a bit harder, mate. EXP might improve for the better.
It tooks only 100ms if you reroute each good for itself. If you take one "big" run, where you search the graph for all possible ways, you can calculate all new routes at once.
Wiki says to the Dijkstra algorithm (which is a generalization BFS or A* without heuristic):
So its better, to calculate all connections at once.
@Colin
The improvements in responsiveness are made by Prissi so far. Gerw's suggestion to combine search has not been incorporated yet. And you don't need to thank me -- I have done nothing at all to improve the game responsiveness in this case. :P
@Prissi
I have looked at the code and noticed that in spieler_t::step(), total menge (i.e. get_finance_history(0,HALT_WAITING)) is still used as a basis for limiting rerouting (though packet count is also used as a limit in haltestelle_t::reroute_goods()). IMHO, this will cause some steps to do more rerouting or less rerouting depending on the sizes of the packets.
I have created a patch to make spieler_t::step() dependent on packet count. It is based on r2650, relative to the trunk folder. You may notice that I have changed all last_index to last_catg_index -- this is only done to improve readability.
The proble is, that the initial connection calculation also takes some time. Since the routine does not know (yet) in which state it is, it has to have some guidance. THe current code is not finished, I plan to move it into haltestlle, which would iterate over all stops and then could also display the current stage of routine (for debug purposes).
Prissi,
I suppose the above reply is in response to my patch?
If you take a look at my patch again, I have decremented units_remaining by 2 for the case of rebuilding destinations (please see haltestelle_t::step()). So, at most 64 / 2 = 32 halts will rebuild destinations in 1 single step. If you think 32 halts in 1 step is too much, you can increase that number from 2 to say, 3 or 4.
Knightly
Tested again with same savegames.
Colin's savegame
Player 1: 1184 halts
Public: 3 halts
prissi's patch
rebuild_destinations() used 112 steps and reroute_goods() used 212 steps.
Sum of them is 324 steps.
Knightly's patch
rebuild_destinations() used 37 steps and reroute_goods() used 945 steps.
Sum of them is 982 steps.
Another game: 3 players and public halts
player 1: 36 halts
public: 945 halts
player 2: 122 halts
player 3: 90halts
sums: 1193 halts
prissi's patch
rebuild_destinations() used 94 steps and reroute_goods() used 246 steps.
Sum of them is 340 steps.
Knightly's patch
rebuild_destinations() used 30 steps and reroute_goods() used 551 steps.
Sum of them is 581 steps.
I like the idea of Knightly's patch because it reduces number of halts/step on normal state.
Although 500 steps mean two minutes. In multiplayer, when a station is placed every two minutes rerouting may never be completed then ... ALso iterating over all lines takes a constant amount of time for most stations, thus the step routine needs to know the state of each station to decide what to do. Furthermore one player are finished before others; the proper way out is most likey a incorporation into haltestelle_t.
@Knightly
I overlooked you attachment. Will have a look soon.
You can change it by units_remaining.
I think that anyway it isn't possible to play such a big game in network mode even though without this problem.
The question is which counter is reasonable.
I did some testing with this and yoshis game and I think I reasonable balanced it. Further tries are appreaciated.
@z9999 : Thank you very much for testing my patch :)
@Prissi : Thanks a lot for considering my patch and adopting it after adaptation.
As for Gerw's suggestion to combine multiple searches for each individual packet into a single search, I think he is right that such combined search can reduce search time significantly, by eliminating the need to rebuild search trees from scratch for each individual packet. However, I think search time cannot be reduced to such a short time as entailed by the single longest individual search, because each examined halt needs to be compared against a list of outstanding ware packets' destination halts to see if there is a match. With a large game like Colin's, some halts may have over 900 pax packets, so even a combined search have to take quite some time to finish, still causing unresponsiveness problem.
I think Prissi's current system has successfully distributed search time relatively evenly across a reasonable number of steps. Nevertheless, combined search can still be useful in this system : it is possible to determine the number of packets to perform path search as a
batch for the current category, using Gerw's combined search routine. If this can successfully reduce search time by a large percentage, we can consider increasing
units_remaining so as to shorten the lifecycle of a global reroute.
To be more concrete, here is a code block from haltestelle_t::reroute_goods() :
We can reroute as a batch a number of packets, determined by the smaller of (1)
units_remaining and (2)
(warray->get_count() - last_ware_index).
My apologies to Prissi. I did not realise that it was mainly your patch that fixed my game. Thank you mate.
yoshi's game is only 1 player and amount of waiting people are large.
****ume munti player and small amount of waiting people but poor PC.
I tested another game.
player 1: 9 halts used 1 step
public: 252 halts used 28 steps
player 3: 77 halts used 3 steps
player 4: 114 halts used 3 steps
player 5: 95 halts used 3 steps
player 6: 245 halts used 4 steps
player 7: 25 halts used 1 step
player 8: 104 halts used 2 steps
player 9: 31 halts used 1 step
sum of halts is 952 halts
Halts per steps
older version: 952/32steps=30 halts/step
prissi's patch: 9+252/28+77/3+114/3+95/3+245/4+25+104/2+31=284 halts/step
Every step reculc 284 halts in your patch.
If anyone is interested I've just uploaded my latest SVE using r2650 to the Simutrans File Sharing site. http://simutrans-germany.com/files/upload/Colin's%20Standard%202650.sve
Can anyone tell me why my uploads always have a Red 'File Delete' where the uploaders name usually goes?
If you sort the destinations of the 900 packets, you can use binary search and you will need at most log(900) = 10 comparisons each step (Maybe a hash table is even better?). Further, you can kick out the packets, for which the route is already found. Imho, it takes maybe the time of routing 5 packets (with long route).
@ Colin :
Once you have logged on to the file sharing service, files that you have previously uploaded will show a Delete link so that you can remove them if you want. That doesn't mean that those files have been deleted.
@ Gerw :
I see. Look forward to your patch then :)
I don't feel like patching now. I've got not much time this (and the next 2) weeks and I will wait until prissi has finished his work. Otherwise, I have to rewrite my patch thousand times... Moving the rerouting to haltestelle_t is a good idea, but I don't like the idea of splitting, since Imho my suggestion is more elegant.
And it would also be possible to process not a fixed number of stops (or with the splitting: packets), but as many as 100 ms are gone, isn't it?
Actually colins game is a little atypical, since usually never more than a few tens of packets are waiting in any stop for all the large savegames I ever got. Usually the RESCHEDULING takes more time, since one has to loop over a lot of convois and lines. Therefore building a search tree most likely will take longer than the actual rescheduling; also most stations have also in colins game very few people waiting.
Maybe, for stations above some limit, newly arriving packets should be discarded from the game.
Maybe the lineless convoys and lines should also be cached at the stations?
You don't need to build a search tree. You could take the code from suche_route, but in every step you test if the halt is the destination of any packet waiting there.
Thats probably because I try to provide enough transport for everyone. Mind you I just picked a station at random and it had 136231 p****engers waiting. Isn't that quite a lot?
Result of r2658. It seems to be good.
First 3 games are same savegame which were tested before, last 2 are new.
Colin's savegame (551 convois, 1187 halts)
r2650 rebuild_destinations()=112, reroute_goods()=212, sum=324
r2658 rebuild_destinations()=20, reroute_goods()=488, sum=508
2nd savegame (3 players, 6671 convois, 1193 halts)
r2650 rebuild_destinations()=94, reroute_goods()=246, sum=340
r2658 rebuild_destinations()=20, reroute_goods()=373, sum=393
3rd savegame (9 player, 870 convois, 952 halts)
r2650 rebuild_destinations()=27, reroute_goods()=28, sum=55
r2658 rebuild_destinations()=14, reroute_goods()=42, sum=56
regbombay1's London_C.sve (4297 convois, 8577 halts)
r2658 rebuild_destinations()=135, reroute_goods()=117, sum=252
fuzion_051's The World 3.sve (2791 convois, 839 halts)
r2658 rebuild_destinations()=70, reroute_goods()=368, sum=438
If they are for pak64 or some other plain vanilla pak (no addons etc) I would be very interested in your testgames.
Unfortunately, each savegame needs ton of addons, sorry.
r2664 SDL has done it for me. The game is running 99.99% more efficiently. I doubt if you'll get it any better, but it's amazing what you can do when you put your minds to it. Thanks a million guys, I'm enjoying the game again. :D :D :D. I don't use GDI so I can't comment on that.
Hi Prissi,
I have created another patch, based on r2682 and relative to the trunk folder, which speeds up route searching. The major changes are :
(1)
The check for destination halt is moved inside the loop which iterates over reachable halts of the current halt. It helps to find the destination halt earlier on and skip some unnecessary iterations and checking.
(2)
A per-ware-category counter for each halt is added, which records the number of lines/lineless convoys serving that halt for a particular ware category. This info is then used in suche_route(). Only origin halt, transfer halts and destination halt will be added to the HNode array.
I have performed some tests by rebuilding all halts' reachable destinations and rerouting all halts' goods in one step, recording the time used. Below are the results (the figures are in milliseconds) :
Before applying the patch :
**** Colin Rebuilding destinations takes : 174 Rerouting goods takes : 43746 Complete refresh takes : 43920
**** Yoshi Rebuilding destinations takes : 5063 Rerouting goods takes : 2039 Complete refresh takes : 7102
**** 1982 Rebuilding destinations takes : 40 Rerouting goods takes : 332 Complete refresh takes : 372
**** Wipi Rebuilding destinations takes : 8 Rerouting goods takes : 67 Complete refresh takes : 75
**** Ferndale Rebuilding destinations takes : 3 Rerouting goods takes : 20 Complete refresh takes : 23
| | After applying the patch :
**** Colin Rebuilding destinations takes : 170 Rerouting goods takes : 4172 Complete refresh takes : 4342
**** Yoshi Rebuilding destinations takes : 5114 Rerouting goods takes : 684 Complete refresh takes : 5798
**** 1982 Rebuilding destinations takes : 34 Rerouting goods takes : 145 Complete refresh takes : 179
**** Wipi Rebuilding destinations takes : 7 Rerouting goods takes : 24 Complete refresh takes : 31
**** Ferndale Rebuilding destinations takes : 2 Rerouting goods takes : 3 Complete refresh takes : 5
|
From the above, it seems that rebuilding destinations only takes a little more time, but the reduction in rerouting time is quite significant. You may wonder why such reduction is especially larger in Colin's save game : it is because the proportion of transfer halts is lower in his game, and thus can benefit more from this patch.
Further testing is welcome.
Incidentally, I have removed the following cleanup code from suche_route() :
I think it is not necessary to reset the markers, as suche_route() only checks the
inequality between the current marker and a halt's marker.
Besides, I have renamed some variables to better reflect their functions to enhance code readability.
Knightly
Well, apart from the servicable_counter: Could the purpose not be fulfilled by the warenziele array anyway? If this is empty (or rather it will has only one entry) then this halt is only connected to us.
I.e. only adding when reachable_halt->get_warenziele[ware_catg_index].get_count() > 1 ? Since hat gehalten just adds to warenziele anyway.
Prissi,
The presence of reachable halts in warenziele does not indicate whether a halt is a transfer halt or not. A halt may have some entries in warenziele, but they may come from a single line's schedule. A transfer halt needs to have at least 2 lines/lineless convoys serving it, for a certain ware category.
Knightly
Interesting. I tested only reroute_goods().
Most interesting thing is London_C.sve with max_hops 8000.
It's really the same as max_hops 2000.
2nd savegame (3 players, 6671 convois, 1193 halts)
standard 9799 ms
Knightly 3885 ms
3rd savegame (9 player, 870 convois, 952 halts)
standard 677 ms
Knightly 221 ms
regbombay1's London_C.sve (4297 convois, 8577 halts)
max_hops 8000
standard 37259 ms
Knightly 4548 ms
max_hops 2000
standard 5517 ms
Knightly 4534 ms
fuzion_051's The World 3.sve (2791 convois, 839 halts)
standard 7979 ms
Knightly 4476 ms
Z9999 :
Thank you very much for testing. :)
As I have mentioned above, only origin halt, transfer halts and destination halt will be added to the HNode array. Thus, the number of HNodes used in a single search is largely reduced. Judging from your figures :
I think all or most route searches managed to complete within the 2000 max_hops limit. Probably the proportion of transfer halts is rather low.
Knightly
Additional information. The percentage of transported p****engers.
regbombay1's London_C.sve (4297 convois, 8577 halts)
max_hops 2000
standard 9%
Knightly 83%
:award:
I think the size of the HNode array can be reduced (by half?) to compensate for the additional memory consumed by the service_counters.
Here is my latest STD SVE you can see the improvement that has been made also please notice thatthe game has gone from 470mil in the red to 10mil in the black.
Thanks to gwer/Prissi/Knightly and everyone else who contributed to making this game playable. If I didn't mention your name I'm sorry, but I do know who you are, Fred, er, Joe, No its Vil something or was it Z9***.
Anyway Thanks
http://simutrans-germany.com/files/upload/Colin's Standard 2682.sve
I have attached an updated patch, also against r2682 and relative to the trunk folder. It includes 2 bugfixes and reduces the size of HNode array by half.
Knightly kindly sent me a copy of the GDI version of his patch and I'm here to tell you, the flow of my game improved beyond all expectations. The traffic (all modes) flows smoothly, there is no jerkiness of movement and no pauses at Auto Save.
As I pointed out to Knightly, GDI version has never worked with larger maps on my computer. I've always used SDL. With his patch I can now use GDI.
Anyone who has downloaded my latest SVE should try it with this patch.
Thanks Knightly.
Only the service counter is not included, since to me it seems like the warenziel-list does essentially the same.
Prissi,
Sorry, but I don't think so.
A halt X served by a single pax line which has a schedule containing, say, 8 halts, will end up having 7 halts in warenziele[0] for pax. That fulfils your criteria :
but X is definitely *not* a transfer halt. A transfer halt is where ware packets can switch lines/convoys. If you think that I am wrong on this, please state the reasons.
The concept of transfer halt is the *gist* of my patch, and distorting it will largely diminish the usefulness of this patch. My centralised search in ST EXP also focuses on transfer halts, and it works fine for a long time without any problem, and has significantly reduced search time. I can't see why you have to insist on removing the service counters (which is actually the core of my patch), unless it really takes up a lot of space or processing time per se.
However, the addition of service counters only consumes a little bit more of memory. 1000 halts with 16 ware categories will only use an additional (16 categories + 4 byte pointer) x 1000 halts = ~20KB. And the additional processing time for clearing and incrementing the service counters is relatively small, according to my test results. You may perform profiling to be accurate.
Knightly
But your patch, if I have two convois with the same schedule, then it will be too larger than 1 ... Or did I mistake it completely? But it would be still an improvement for any case without this.
I will look at it again.
You are right, but only if both convoys are lineless convoys. I think most players will group 2 lineless convoys with identical schedules into a single line for easier management, so this is not much of a problem in general.
You can solve this problem by modifying hat_gehalten() to return a boolean value which indicates whether any halt has been added to warenziele in this function call, and increment the service_counter[catg] in rebuild_destinations() only if that return value is true. But I doubt if the additional processing time justifies the gain in general.
BTW, do you mind performing profiling on my patch? I am interested to know how much faster suche_route() is, and how much slower rebuild_destinations() is, after the modifications. Many thanks in advance! :)
Edit 1 :To make sure that you understand the difference between using service_counter[catg] and your suggested check :
I have attached a simple diagram (1st attachment) to illustrate it. ****uming only pax are involved. The letters are the names of the halts, and the numbers in bracket refer to the number of halts in warenziele[0]. The blue circle is served by Line P (A>B>C>D>E), and the green circle by Line Q (E>F>G>H>I). In this setup, only Halt E is a transfer halt.
With my patch, only Halt E will be identified as a transfer halt, since there are 2 lines (P & Q) serving it (i.e. service_counter[0] > 1). Searching route from A to H will only add origin halt A, transfer halt E and destination halt H to the HNode array.
But with your suggestion, all halts satisfy your criteria (i.e. warenziele[0] > 1), thus any intermediate non-transfer halts may also be added to HNode unnecessarily during the search. In this particular case, nodes are added for A, B, C, D, E, F, G, H during the search from A to H, wasting processing time -- the waste is not just on adding the nodes, but also on additional processing time involved in processing that node afterwards.
As you can see, my original code and your suggestion are very different, and your suggestion will largely reduce the effectiveness of my patch in reducing search time. Service_counter[catg] is the cornerstone of my patch, and it is necessary to make my patch work as intended. So, unless you can find a better way to determine transfer halt, please do not remove service_counter from my patch.
As a sidenote, during my development of ST EXP's centralised search, I realised that most games usually have only 50% or less of its halts as transfer halts (of course, there are exceptions, depending on player's strategies). That is the reason why focusing on transfer halts in suche_route() can significantly reduce search time.
Edit 2 :Just in case you are not sure why we don't need to add non-transfer halts to the HNode array, please take a look at the 2nd attached image. In this diagram, there is only 1 single line serving halts A to E.
When searching for the route from A to E, A as the origin halt will first be added to the HNode array. Then, A's reachable halts (namely, B, C, D and E) are checked, one by one : with B, it is checked against the destination halt to see if we have found the destination, but no, this is not the destination; next, check if it is a transfer halt, but no, so no more processing is needed.
Rationale : if B is a non-transfer halt, there is only a single line/lineless convoy serving it; if current node A has B as its reachable halt, then A must also be served by the same line/lineless convoy, which in turn means that B's reachable halts are included in A's reachable halt's list (except A of course) and will be checked while processing A's reachable halts. Thus, there is no need to add B to HNode array, and no need to check B's reachable halts.
Even so, nothing worse than current bahavior.
In current simutrans, search priority of hub station is very low, and p****engers can't reach their destination. That causes poor game flow, if player set high max_hops.
In current simutrans, I can't play regbombay1's London_C.sve withmax_hops 8000 because of Re: poor game flow. But max_hops 2000 is not fun, bevause only 9% of p****engers can travel.
@Knightly
Can I use your patch for any fork version of simutrans, if this was denied ?
It will be incorporated. I will probably just get this a more descriptive name (like number_of_non_unique_schedules or so ... )
@ Z9999
Sure, no problem :D Please take the liberty to use it anywhere you like.
@ Prissi
You mean "service_counter"? Actually I am not sure how it should be named. "service_counter" is the best I can think of, meaning "serviced by how many lines/lineless convoys", but I admit that it is not ideal ... If you can think of a better yet concise description, please feel free to change it. ;)
@Knightly
Thank you.
Prissi,
A suggestion regarding connections_searched in rebuild_destinations() : instead of incrementing this variable by 1 per line/lineless convoy, we can increment it by fpl->get_count(). I propose this because players seem to vary a lot as to the size of their schedules : some prefer to have 7-8 halts per schedule, but others may prefer >30 halts. Longer schedules take longer to process, thus by using fpl->get_count(), you can better control the pace of rebuilding destinations. Of course, your formula for decrementing units_remaining in haltestelle_t::step() needs to be changed accordingly.
Knightly
That is a valid suggestion. ALso it came to my mind, that it is very easy to check, whether a schedule added something to a connection or not, i.e. whether there are more than one identical schedule: The counter of warenziele is large after addition. WIll go to implement that and your patch tonight I think.
Good idea. A simple check like this should do.
For games like Yoshi's with numerous lineless convoys, I agree with Gerw that the only way to reduce the time involved in rebuilding destinations is to register them at the halts like registering the lines. Do you think it is feasible?
Yoshis game is from an 87 version several years ago where line management was more clumsy ... and it does not contains so much lineless convois. I think this could be deferred, but if you want to have a go at it. Certainly it may give advantage; but then convois need to handle changes from and to lines even more carefully.
EDIT:
I did profiling, but the differences are small; moreover, most improvement seems from less calls to halthandle_t::is_bound() and halthandle_t(). But as z9999 said, I think most people actually finding a route is more important.
Do you mind posting the profiling results? Many thanks.
Well, they are about 12MB, and the change is not seen easily. But here you go, everything up to 0.5% contribution, no calling stack:
Flat profile: (improved system)
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls s/call s/call name
9.29 19.95 19.95 16647 0.00 0.00 karte_t::sync_step(long, bool, bool)
5.83 32.47 12.52 1529605 0.00 0.00 haltestelle_t::suche_route(ware_t&, koord*, bool)
5.00 43.20 10.73 5044991 0.00 0.00 display_img_nc(short, short, short, unsigned short const*)
3.82 51.41 8.21 1869586 0.00 0.00 haltestelle_t::hole_ab(ware_besch_t const*, unsigned int, schedule_t const*, spieler_t const*)
2.97 57.80 6.39 1460434544 0.00 0.00 vector_tpl<ware_t>::get_count() const
2.15 62.42 4.62 2094715 0.00 0.00 hashtable_tpl<sync_steppable*, sync_steppable*, ptrhash_tpl<sync_steppable*> >::put(sync_steppable*, sync_steppable*)
2.12 66.97 4.55 940084741 0.00 0.00 vector_tpl<ware_t>::operator[](unsigned int)
2.03 71.34 4.37 696776484 0.00 0.00 quickstone_tpl<haltestelle_t>::is_bound() const
1.87 75.35 4.01 192406260 0.00 0.00 ding_t::get_flag(ding_t::flag_values) const
1.82 79.27 3.92 130461686 0.00 0.00 planquadrat_t::get_boden_in_hoehe(short) const
1.75 83.02 3.75 549425413 0.00 0.00 slist_iterator_tpl<hashtable_tpl<sync_steppable*, sync_steppable*, ptrhash_tpl<sync_steppable*> >::node_t>::next()
1.69 86.66 3.64 32334462 0.00 0.00 convoi_t::sync_step(long)
1.67 90.25 3.59 233174821 0.00 0.00 grund_t::get_hoehe() const
1.67 93.83 3.58 574663478 0.00 0.00 quickstone_tpl<haltestelle_t>::operator==(quickstone_tpl<haltestelle_t> const&) const
1.65 97.38 3.55 152483261 0.00 0.00 dingliste_t::bei(unsigned char) const
1.64 100.91 3.53 937711102 0.00 0.00 quickstone_tpl<haltestelle_t>::operator->() const
1.54 104.22 3.31 240439540 0.00 0.00 karte_t::ist_in_kartengrenzen(short, short) const
1.52 107.49 3.27 18540007 0.00 0.00 convoi_t::calc_acceleration(long)
1.48 110.66 3.17 139927198 0.00 0.00 grund_t::get_weg(waytype_t) const
1.43 113.73 3.07 176234657 0.00 0.00 vehikel_basis_t::fahre_basis(unsigned int)
1.33 116.59 2.86 487695305 0.00 0.00 koord::koord(short, short)
1.20 119.17 2.58 249793805 0.00 0.00 operator==(koord const&, koord const&)
1.20 121.75 2.58 62523573 0.00 0.00 vehikel_t::get_gesamtgewicht() const
1.19 124.30 2.55 54759565 0.00 0.00 quickstone_tpl<simline_t>::is_bound() const
1.09 126.64 2.34 359156267 0.00 0.00 koord3d::get_2d() const
1.04 128.87 2.23 32300103 0.00 0.00 convoi_t::step()
0.95 130.91 2.04 67751873 0.00 0.00 grund_t::get_vmove(koord) const
0.94 132.93 2.02 82602392 0.00 0.00 ding_t::is_moving() const
0.90 134.86 1.94 16125608 0.00 0.00 dingliste_t::remove(ding_t const*)
0.83 136.65 1.79 648278617 0.00 0.00 vector_tpl<quickstone_tpl<haltestelle_t> >::operator[](unsigned int) const
0.75 138.27 1.62 89449878 0.00 0.00 fussgaenger_t::sync_step(long)
0.69 139.76 1.49 665227174 0.00 0.00 vector_tpl<quickstone_tpl<haltestelle_t> >::get_count() const
0.60 141.04 1.28 28862148 0.00 0.00 stadtauto_t::sync_step(long)
0.55 142.23 1.19 174167252 0.00 0.00 hashtable_iterator_tpl<sync_steppable*, sync_steppable*, ptrhash_tpl<sync_steppable*> >::next()
0.54 143.40 1.17 240375956 0.00 0.00 karte_t::lookup(koord) const
0.54 144.55 1.15 143737674 0.00 0.00 obj_besch_t::get_child(int) const
0.53 145.69 1.14 133776 0.00 0.00 route_t::intern_calc_route(karte_t*, koord3d, koord3d, fahrer_t*, unsigned int, unsigned int)
0.52 146.80 1.11 13741184 0.00 0.00 haltestelle_t::get_ware_summe(ware_besch_t const*) const
Flat profile: (previous system)
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls s/call s/call name
9.45 21.54 21.54 1483245 0.00 0.00 haltestelle_t::suche_route(ware_t&, koord*, bool)
8.50 40.93 19.39 17067 0.00 0.00 karte_t::sync_step(long, bool, bool)
4.95 52.22 11.29 5431731 0.00 0.00 display_img_nc(short, short, short, unsigned short const*)
4.43 62.31 10.10 1645188374 0.00 0.00 quickstone_tpl<haltestelle_t>::is_bound() const
2.85 68.81 6.50 1898991967 0.00 0.00 quickstone_tpl<haltestelle_t>::operator->() const
2.03 73.44 4.63 1596670257 0.00 0.00 vector_tpl<quickstone_tpl<haltestelle_t> >::operator[](unsigned int) const
1.89 77.77 4.32 2050348 0.00 0.00 hashtable_tpl<sync_steppable*, sync_steppable*, ptrhash_tpl<sync_steppable*> >::put(sync_steppable*, sync_steppable*)
1.89 82.06 4.30 963837657 0.00 0.00 vector_tpl<ware_t>::get_count() const
1.89 86.36 4.30 132048612 0.00 0.00 planquadrat_t::get_boden_in_hoehe(short) const
1.82 90.52 4.16 1817492 0.00 0.00 haltestelle_t::hole_ab(ware_besch_t const*, unsigned int, schedule_t const*, spieler_t const*)
1.71 94.41 3.89 188892750 0.00 0.00 ding_t::get_flag(ding_t::flag_values) const
1.69 98.27 3.86 155337440 0.00 0.00 dingliste_t::bei(unsigned char) const
1.66 102.06 3.79 1666172006 0.00 0.00 vector_tpl<quickstone_tpl<haltestelle_t> >::get_count() const
1.65 105.82 3.76 237458275 0.00 0.00 grund_t::get_hoehe() const
1.55 109.36 3.54 32458683 0.00 0.00 convoi_t::sync_step(long)
1.51 112.80 3.44 246075434 0.00 0.00 karte_t::ist_in_kartengrenzen(short, short) const
1.51 116.24 3.44 141813284 0.00 0.00 grund_t::get_weg(waytype_t) const
1.50 119.65 3.41 528617071 0.00 0.00 slist_iterator_tpl<hashtable_tpl<sync_steppable*, sync_steppable*, ptrhash_tpl<sync_steppable*> >::node_t>::next()
1.48 123.02 3.37 172490871 0.00 0.00 vehikel_basis_t::fahre_basis(unsigned int)
1.46 126.36 3.34 18589421 0.00 0.00 convoi_t::calc_acceleration(long)
1.32 129.37 3.01 496337298 0.00 0.00 koord::koord(short, short)
1.16 132.01 2.65 32234028 0.00 0.00 convoi_t::step()
1.13 134.58 2.57 54660352 0.00 0.00 quickstone_tpl<simline_t>::is_bound() const
1.10 137.08 2.50 367527603 0.00 0.00 koord3d::get_2d() const
1.07 139.52 2.44 68549744 0.00 0.00 grund_t::get_vmove(koord) const
1.04 141.90 2.38 252639444 0.00 0.00 operator==(koord const&, koord const&)
0.98 144.13 2.23 577424393 0.00 0.00 vector_tpl<ware_t>::operator[](unsigned int)
0.97 146.35 2.22 62518948 0.00 0.00 vehikel_t::get_gesamtgewicht() const
0.94 148.49 2.14 83061006 0.00 0.00 ding_t::is_moving() const
0.78 150.27 1.78 291959379 0.00 0.00 quickstone_tpl<haltestelle_t>::operator==(quickstone_tpl<haltestelle_t> const&) const
0.78 152.04 1.77 85600845 0.00 0.00 fussgaenger_t::sync_step(long)
0.78 153.81 1.77 15981170 0.00 0.00 dingliste_t::remove(ding_t const*)
0.70 155.40 1.59 69182787 0.00 0.00 haltestelle_t::get_warenziele(unsigned char) const
0.62 156.81 1.41 28987517 0.00 0.00 stadtauto_t::sync_step(long)
0.60 158.18 1.37 246011854 0.00 0.00 karte_t::lookup(koord) const
Indeed, suche_route is much faster, it is no longer the most time consuming subroutine any more. I think the next release will be a big step towards playing larger maps or better perfomance on netbooks.
I have written another patch to speed up suche_route() a bit when checking those already-processed halts. Although we now only check transfer halts, transfer halts also have reachable halts which have already been processed. Originally, halthandle boundness is checked before checking
marke, meaning that 2 checks are required. But the boundness check is actually unnecessary for these already-processed halt entries.
This patch moves
marke out of haltestelle_t, and consolidates them into a static array. In this way, we can check the marker first, and skip the boundness check if this halt has already been processed before.
For small games, the effect of this patch is not noticeable; but for large games like Colin's and Yoshi's, there is a noticeable reduction in search time. In general, this patch will have a greater effect on games with a higher proportion of transfer halts.
The static array consumes 262KB of memory, which is the amount of memory saved by reducing the size of the HNode array by half in the previous patch.
Here are some testing results :
Before applying the patch :
**** Colin - Finished **** Rebuilding destinations takes : 112 ms Rerouting goods takes : 7487 ms Complete refresh takes : 7599 ms
**** Yoshi **** Rebuilding destinations takes : 4963 ms Rerouting goods takes : 616 ms Complete refresh takes : 5579 ms
**** 1982 **** Rebuilding destinations takes : 40 ms Rerouting goods takes : 149 ms Complete refresh takes : 189 ms
**** Wipi **** Rebuilding destinations takes : 9 ms Rerouting goods takes : 19 ms Complete refresh takes : 28 ms
**** Ferndale **** Rebuilding destinations takes : 2 ms Rerouting goods takes : 5 ms Complete refresh takes : 7 ms
| | After applying the patch :
**** Colin - Finished **** Rebuilding destinations takes : 115 ms Rerouting goods takes : 6855 ms Complete refresh takes : 6970 ms
**** Yoshi **** Rebuilding destinations takes : 4945 ms Rerouting goods takes : 528 ms Complete refresh takes : 5473 ms
**** 1982 **** Rebuilding destinations takes : 40 ms Rerouting goods takes : 147 ms Complete refresh takes : 187 ms
**** Wipi **** Rebuilding destinations takes : 8 ms Rerouting goods takes : 18 ms Complete refresh takes : 26 ms
**** Ferndale **** Rebuilding destinations takes : 2 ms Rerouting goods takes : 3 ms Complete refresh takes : 5 ms
|
Note : Colin's save game is different from the one used previously.
Further testing is welcome.
This will certainly perform different on different processors, the longer the pipeline, the better.
I am quite torn on this; even in your old patch you used an unchecked array, which was marginally ok, since it was only accessed with an predefined counter. However, for a large project like simutrans, I would like to keep objects together. The mark of a station belongs imho to the station.
Of curse, that is only my view. But there were many more places liek the different tree types. The are accessed via a 9 bit variable. Nevertheless I put them into a vector, which will check boundness at every access, to catch a broken register or some other programm error before really bad things happen. Even if this costs same percent performance.
People tend to forget, what things belong together. Keeping things together with their data was imho the main idea of OOP. Maybe I have become to fundamentalistic in my views. But I am certainly open to challenge my optionions by arguments.
Prissi,
Most probably you are right on this. Pipeline issue aside, this patch will avoid many unnecessary halthandle boundness checks. More specifically, each halt's handle will be checked for boundness for no more than once in a single suche_route() call.
If
non_identical_schedules were to use vector, then I believe
warenziele should also be a vector of vectors too, right? IMHO, if our code are written such that errors are already guarded against, why should we compel ourselves to use vectors to incur unnecessary overhead?
I think this is more a point of view than hard fact. Extension buildings are part of a station, no doubt. But for markers, I think its sole purpose is to serve suche_route(), but not inherently part of a halt by definition. It's only working data for suche_route().
I agree that things that go together should be grouped together in their respective cl****es. And I agree with the doctrines of OOP too. But sometimes we tend to
****ume that certain variables should belong to a certain cl**** of objects; but if we think deeper, there may be no necessary connection between them.
As for the patch, I know in advance that you may not like the static array. Actually, if I do not need to initialise the array elements with zeros, I would have declared it as a static local array inside suche_route() like the HNode array, for I believe the markers are only working data of suche_route().
Edit :I have made a small change to the patch (moving the init_markers() call out of an IF block in karte_t::init()). Updated patch attached.
Knightly
Trying to deliver arguments pro separate markers:
If the lifetime of data differs much, I think this is a sign that they are independent, and can/should be kept independently.
Stations have a lifetime of program runtime, the markers only for a search.
Also, if we think about threading, such marks must not be part of the station data, otherwise multi-threaded searches are not possible.
Looking at the code, I feel like it would make sense to make the markers part of halthandle_t, i.e. derive a cl**** from quckstone_tpl. That way also the initialisation would be automatically correct. Even more, the marked and mar test could be then part of the more intelligent halthandle too. But I would like to here you opinion.