Bay 12 Games Forum


Author Topic: FPS Science and observations with 1000 dorfs population  (Read 5800 times)

Razzim

  • Bay Watcher
FPS Science and observations with 1000 dorfs population
« on: March 13, 2023, 10:54:03 am »

Inspired by Putnam's famous revelation about the impact of pathing, I've spent the past 2 months running a series of experiments on how different embark/population configurations affect FPS.

I. The setup

1. Most of my experiments use a population of 1000 dwarves, to keep the tests as consistent as possible and to focus on the impact of units alone on fps.

2. Most worlds are generated as small or medium regions with default settings, as I found them the most consistent.

3. Most of the testing is done on 16x1 embarks (a so-called noodle embark), because it's the only embark size that can handle 1000 dwarves. More on that later.

4. Most of my test designs are open: wholly dug, checkerboarded single z-levels. Something like:

XOX
OXO
XOX

X - dig
O - wall

5. To spread the dwarves evenly throughout the areas, I create perpendicular stripes of meeting halls, separated by around one screen length (not because it's the most efficient, but because it's the easiest and fastest to set up multiple times). It looks roughly like this:

MXOXOXOMXOXOXOXOXMOXOXOXOXMXOXOXOMXOXOXOXOXMOXOXOXOXM
MOXOXOXMOXOXOXOXOMXOXOXOXOMOXOXOXMOXOXOXOXOMXOXOXOXOM
MXOXOXOMXOXOXOXOXMOXOXOXOXMXOXOXOMXOXOXOXOXMOXOXOXOXM

X - dig
O - wall
M - meeting hall
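
The layout in points 4 and 5 can be sketched in a few lines of Python. This is only an illustration of the pattern, not a tool used in the tests: the function names and the `hall_every` spacing parameter are hypothetical, and the actual stripes were placed by hand at roughly one screen length apart.

```python
def checkerboard_row(y, width, hall_every):
    """Return one row of the layout as a string.

    X = dug tile, O = wall, M = meeting-hall stripe.
    `hall_every` is a made-up spacing parameter for this sketch.
    """
    cells = []
    for x in range(width):
        if x % hall_every == 0:
            cells.append("M")   # vertical meeting-hall stripe
        elif (x + y) % 2 == 0:
            cells.append("X")   # dug tile
        else:
            cells.append("O")   # wall left in place
    return "".join(cells)


def layout(width, height, hall_every):
    """Build the whole z-level as a list of row strings."""
    return [checkerboard_row(y, width, hall_every) for y in range(height)]


if __name__ == "__main__":
    # A tiny 27x3 slice with a stripe every 9 tiles, just to show the shape.
    for row in layout(width=27, height=3, hall_every=9):
        print(row)
```

For a full 16x1 embark the same call would use width=768 and height=48, since each embark tile is 48x48 map tiles.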

6. All of my tests are run without changing the zoom level, as zooming is proven to affect performance. That means 2x zoomed out from max zoom.

7. My recent test embarks are all on deserts without any vegetation, to keep trees/herbs from interfering with the results.

8. All testing worlds have all 3 caverns enabled.

9. Most of the testing was done on v50.07.

Relevant info about my PC:
- i7-12700K at stock clocks
- 32 GB of 6000 MHz CL34 RAM

II. To keep my science as free from interfering variables as possible, I use a considerable amount of cheating, which includes:

1. Debug tools in df.global structure:
- debug_noberserk
- debug_nodrink
- debug_noeat
- debug_nomoods
- debug_nosleep
- debug_fastmining

2. "fastdwarf 1 1" DFHack command to quickly dig the huge areas required for experimenting, changed to "fastdwarf 0 0" after setup is done.

3. "migrants-now" DFHack command to quickly pool huge numbers of dwarves in a matter of days, to rule out any world changes impacting the fps.

Thanks to all the cheats, there are absolutely 0 produced items interfering with the results.

III. The results

100% certain as of now, proven through countless different embarks:

1. The feature of skipping interactions between dwarves if they are 26 or more tiles apart is by far the most impactful on performance. It's the mechanism that makes it possible to have 1000 dwarves in a single embark with reasonable fps (at one point even 65 fps after culling all migrated animals!).

2. The most performance-efficient embark for my testing conditions is 16x1 (and probably 1x16, but I haven't tested that one). It exploits the 26 tile skipping mechanism the most while still having a relatively small amount of land to calculate/path around. With 1000 dwarves and their migrated animals on a 16x1 embark I get ~50 fps.

Not 100% certain, tested but not across every possible scenario:

1. I found checkerboarding of big areas beneficial for fps, but for counterintuitive reasons. With the same setup as in point I.5 above, but with an entire z-level simply dug out, I get 35 fps, compared to ~50 with checkerboarding. Why? I thought it was because checkerboarding limits LoS. Nope.
It turns out checkerboarding just spreads the dwarves out more along the y-dimension, giving them more chances to escape each other's 26 tile calculation area. A bonus is less spreading of mist/miasma/dust.

2. 4x4 embark gives considerably lower fps compared to 16x1, despite having the same amount of land - see the 26 tile spread rule. 16x2 embark is a bit slower than 16x1 - I've got ~45 fps, probably because of more land/pathing to calculate. But I've done only one 16x2 test.

3. 26 tile skipping doesn't work on z-dimension. On my recent test day I made 4 areas as in point I.5. separated by 26 z-levels, and spread 1000 dorfs on those areas resulting in roughly 250 dorfs on each of them, then removed stairs connecting them. FPS results are absolutely the same as with single 16x1 z-level.

4. 1x1 embark is terrible for fps with a lot of dwarves. I made 2 versions of 1x1 fort. One with stairs connecting 16 different, checkerboarded z-levels, second one with ramps going left-right, to spread the dwarves evenly on each of 16 z-levels. Result? At 1000 dorfs 17 fps with stairs, 18 fps with ramps, despite having 16 areas of 48x48, so roughly the same as with single 16x1 z-level, and the advantage of potentially more LoS-breaks, as there were floors between each z-level.
Fun fact - 16x1 simple checkerboard z-level handles 2100 dorfs at 16 fps. Yep, 1x1 forts are pretty bad.

5. Both small and medium region embarks start with roughly the same fps. The small region suffers no changes to fps after 10 years of world development post-embark. The medium world's fps gets cut in half in about 3 years and stays like this, but I only tested 10 years, and on a limited number of worlds, so it might be heavily dependent on RNG.

6. Increased RAM frequency helps significantly with handling large numbers of dorfs in a small area, and falls in importance with more optimal fort designs (would need a lot more testing though).
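
To put a rough number on the 16x1 versus 4x4 comparison in point 2, here is a back-of-the-envelope sketch. It assumes units are placed uniformly at random and independently on a single z-level (real dwarves cluster around meeting halls, so this is not what the game does), and it treats a pair as "not skipped" when every per-axis delta is at most 26, following Putnam's description of the early-out. The function names are made up for the example.

```python
def axis_fraction(n, limit=26):
    """Fraction of ordered coordinate pairs on an axis of length n
    whose absolute difference is <= limit."""
    # For an offset d there are (n - |d|) coordinate pairs on the axis.
    close = sum(n - abs(d) for d in range(-limit, limit + 1) if abs(d) < n)
    return close / (n * n)


def unskipped_fraction(width, height, limit=26):
    """Fraction of unit pairs NOT skipped on one z-level, assuming
    independent uniform placement (an assumption of this sketch)."""
    return axis_fraction(width, limit) * axis_fraction(height, limit)


if __name__ == "__main__":
    # One embark tile is 48x48 map tiles.
    for name, (w, h) in {"16x1": (768, 48), "4x4": (192, 192)}.items():
        print(name, round(unskipped_fraction(w, h), 4))
```

Under these assumptions the 4x4 map leaves a somewhat larger share of pairs unskipped than the 16x1 map, which points in the same direction as the measured fps difference but is far from proof of the cause.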


As a bonus - a save with 1000 dorfs to test your PC's capabilities. Enabling debug mode is advised (for advanced users, description in the link), but it should run for a bit before the dorfs kill each other. I get 52-54 fps on it. It also gives a good view of my testing methodology.
https://dffd.bay12games.com/file.php?id=16524

Panando

  • Bay Watcher
Re: FPS Science and observations with 1000 dorfs population
« Reply #1 on: March 13, 2023, 12:46:37 pm »

Very useful insights. So basically popular hang out areas should be separated by at least 26 tiles in the x/y directions.
Punch through a multi-z aquifer in under 5 minutes, video walkthrough. I post as /u/BlakeMW on reddit.

DwarfSpirit

  • Escaped Lunatic
Re: FPS Science and observations with 1000 dorfs population
« Reply #2 on: March 13, 2023, 06:31:13 pm »

This is really interesting. My current 200 dwarf/4x4 embark fort design is heavy on verticality and pillars, striking a balance between the new knowledge about LoS/FPS and aesthetics, but the bedrooms, workspaces and stockpiles are all within 25 z-levels of each other. I was already toying with the idea of decentralising across 3-4 'mini forts', so this is the push I need to finally use burrows for their intended function as district creators.... after playing this game for what, 10 years?

Excellent work - and thanks to you too, Panando. It is a time of great !SCIENCE!

Urist McNobody

  • Bay Watcher
Re: FPS Science and observations with 1000 dorfs population
« Reply #3 on: March 13, 2023, 08:10:34 pm »

Is there a typo?  Reading these two statements:

Quote
2. The most performance efficient embark for my testing conditions is 16x1 (and probably 1x16, but I haven't tested that one). It exploits the 26 tile skipping mechanism the most while still having relatively small amount of land to calculate/path around. With 1000 dwarves and their migrated animals on 16x1 embark i get around ~50 fps

Quote
3. 26 tile skipping doesn't work on z-dimension. On my recent test day I made 4 areas as in point I.5. separated by 26 z-levels, and spread 1000 dorfs on those areas resulting in roughly 250 dorfs on each of them, then removed stairs connecting them. FPS results are absolutely the same as with single 16x1 z-level.

If 16x1 checkerboarded was the best-performing configuration, and separating 4x occupied levels by 26 floors performed the same as just one level of a 16x1 embark then why do you say that the distance check doesn't work?  Doesn't that imply that the distance check is also sensitive to the vertical dimension?

In at least one other context (work scheduling) Putnam reported that the distance used for scheduling weight was max(dx, dy) + dz.  That's a little weird, but it would account for the vertical.

Putnam

  • Bay Watcher
  • DAT WIZARD
Re: FPS Science and observations with 1000 dorfs population
« Reply #4 on: March 13, 2023, 08:23:11 pm »

It's "more than 26", not "at least 26", sorry.

Urist McNobody

  • Bay Watcher
Re: FPS Science and observations with 1000 dorfs population
« Reply #5 on: March 13, 2023, 08:46:58 pm »

The expression max(dx, dy) + dz still defines an octahedron.  For each dwarf on level N, only a single tile would be <= 26 away on level N+26.  So shouldn't that have worked?  Or is the early-out defined a little differently, here?
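
The "single tile" claim can be checked by brute force. The sketch below only works through the arithmetic of the formula as quoted, max(dx, dy) + dz, with a limit of 26; whether that formula has anything to do with the early-out is exactly the open question, and the function names and the `search` window are invented for the example.

```python
def scheduling_distance(a, b):
    """Distance as quoted for work scheduling: max(dx, dy) + dz."""
    dx, dy, dz = (abs(p - q) for p, q in zip(a, b))
    return max(dx, dy) + dz


def tiles_within(limit, dz, search=60):
    """Count (x, y) offsets on a level `dz` away whose distance from the
    origin is <= limit. `search` just has to be larger than `limit`."""
    count = 0
    for dx in range(-search, search + 1):
        for dy in range(-search, search + 1):
            if scheduling_distance((0, 0, 0), (dx, dy, dz)) <= limit:
                count += 1
    return count


if __name__ == "__main__":
    for dz in (0, 13, 26, 27):
        print(dz, tiles_within(26, dz))
```

On the same level the region is a 53x53 square, and it shrinks to a single tile at dz = 26 and to nothing at dz = 27.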

My pitiable fort is about to get retired at 11-13fps despite 140 levels of vertical sprawl, so this question is especially important to me right now.

Putnam

  • Bay Watcher
  • DAT WIZARD
Re: FPS Science and observations with 1000 dorfs population
« Reply #6 on: March 13, 2023, 10:44:21 pm »

That equation is totally unrelated to line of sight code. For LoS stuff it's "if any delta is more than 26, skip".
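
Read literally, that rule is a per-axis comparison. A minimal sketch, assuming units are plain (x, y, z) tuples and using a made-up function name (this is not the game's actual code):

```python
SKIP_DISTANCE = 26  # threshold quoted in this thread


def can_skip_los(a, b, limit=SKIP_DISTANCE):
    """True if the pair is skipped: any per-axis delta is more than `limit`."""
    return any(abs(p - q) > limit for p, q in zip(a, b))


if __name__ == "__main__":
    print(can_skip_los((0, 0, 0), (26, 0, 0)))   # exactly 26 apart on x
    print(can_skip_los((0, 0, 0), (0, 0, 27)))   # 27 apart on z
```

The pairs that survive this test form a cube around each unit, 53 tiles on a side, rather than a sphere or an octahedron.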

Razzim

  • Bay Watcher
Re: FPS Science and observations with 1000 dorfs population
« Reply #7 on: March 14, 2023, 07:25:59 am »

Quote from: Urist McNobody
Is there a typo?  Reading these two statements:

Quote
2. The most performance efficient embark for my testing conditions is 16x1 (and probably 1x16, but I haven't tested that one). It exploits the 26 tile skipping mechanism the most while still having relatively small amount of land to calculate/path around. With 1000 dwarves and their migrated animals on 16x1 embark i get around ~50 fps

Quote
3. 26 tile skipping doesn't work on z-dimension. On my recent test day I made 4 areas as in point I.5. separated by 26 z-levels, and spread 1000 dorfs on those areas resulting in roughly 250 dorfs on each of them, then removed stairs connecting them. FPS results are absolutely the same as with single 16x1 z-level.

If 16x1 checkerboarded was the best-performing configuration, and separating 4x occupied levels by 26 floors performed the same as just one level of a 16x1 embark then why do you say that the distance check doesn't work?  Doesn't that imply that the distance check is also sensitive to the vertical dimension?

In at least one other context (work scheduling) Putnam reported that the distance used for scheduling weight was max(dx, dy) + dz.  That's a little weird, but it would account for the vertical.

I... don't know what to think anymore, honestly. On one hand, all testing scenarios seemed to point to the 26+ tile skipping mechanism being the most profitable for fps. But Putnam says that it works regardless of direction, so it should work across z-levels too... yet I can't find a scenario where it does. If it doesn't work at all, then why is a 1x1 embark with 1000 dwarves on 16 different z-levels so terrible, when it should be perfect for conventional LoS breaking? I'm puzzled.

I prepared 2 scenarios with a lower number of dwarves. I think that would be the fastest way for somebody to point out a flaw in my testing methodology and understanding of the results.

Scenario1 - 500 dwarves on a single checkerboarded 16x1 level
https://dffd.bay12games.com/file.php?id=16525

Scenario2 - 500 dwarves spread on 4 different checkerboarded 16x1 levels
https://dffd.bay12games.com/file.php?id=16526

On both of them I get the same fluctuating 130-140 fps

Maybe... Maybe 140 fps on 500 dwarves, or 50 fps with 1000 dwarves is the maximum amount the game can handle with optimal conditions and better designs just can't improve it further?

I guess the next test I have to do is something like 4x1 embark, then adding 26+ separated z-levels to the point I achieve the performance of 16x1.

Urist McNobody

  • Bay Watcher
Re: FPS Science and observations with 1000 dorfs population
« Reply #8 on: March 14, 2023, 08:32:49 am »

Putnam clarified in this thread that:  The bounding box for skipping the calculation is a cube, and that the early-out is for greater-than 26.  So dwarfs on level N can consider dwarfs on level N+26 for further evaluation, but not on level N+27.

Razzim

  • Bay Watcher
Re: FPS Science and observations with 1000 dorfs population
« Reply #9 on: March 14, 2023, 08:39:58 am »

Quote from: Urist McNobody
Putnam clarified in this thread that:  The bounding box for skipping the calculation is a cube, and that the early-out is for greater-than 26.  So dwarfs on level N can consider dwarfs on level N+26 for further evaluation, but not on level N+27.

With 26 empty z-levels between them, the areas are at least 27 levels apart, something like 1st area on z-level 0, 2nd area on z-level 28. And there are certainly more than 26 levels between the 1st area and the 3rd and 4th areas.
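
The level arithmetic can be spelled out as a quick check. This sketch assumes exactly 26 empty z-levels between consecutive areas and starts counting at z-level 0; the helper name is invented for the example.

```python
from itertools import combinations

SKIP_DISTANCE = 26  # early-out applies when a delta is MORE than this


def area_levels(count, empty_between):
    """z-levels of `count` areas with `empty_between` empty levels between
    each consecutive pair (hypothetical helper for this sketch)."""
    return [i * (empty_between + 1) for i in range(count)]


if __name__ == "__main__":
    levels = area_levels(4, 26)
    for low, high in combinations(levels, 2):
        dz = high - low
        print(low, high, dz, "skipped" if dz > SKIP_DISTANCE else "checked")
```

With 26 empty levels the smallest gap is 27, so every pair of areas should fall outside the cube described by Putnam.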

Razzim

  • Bay Watcher
Re: FPS Science and observations with 1000 dorfs population
« Reply #10 on: March 14, 2023, 09:04:08 am »

I've accidentally made something in the shape of an ultimate pathing nightmare. 1000 dwarves constantly bumping into each other, walking in circles and recalculating their paths. I think it would take about a century for a single dwarf to go from one end of the 16x1 embark to the other. But despite that, fps seems to be perfectly reasonable: 130 fps with 500 dorfs, 40 fps with 1000 dorfs. I think it confirms pretty well how insignificant pathing is for performance, considering that the checkerboard hardly goes higher than 140 fps/50 fps. And the fps drop might very well be due to slightly worse LoS breaking than with a checkerboard.

« Last Edit: March 14, 2023, 10:27:11 am by Razzim »

Robsoie

  • Bay Watcher
  • Urist McAngry
Re: FPS Science and observations with 1000 dorfs population
« Reply #11 on: March 14, 2023, 10:50:17 am »

I guess it's hardcoded, but in case there's any hope I'll still ask: is that 26 tile max requirement for line of sight checks editable/moddable?

Putnam

  • Bay Watcher
  • DAT WIZARD
Re: FPS Science and observations with 1000 dorfs population
« Reply #12 on: March 14, 2023, 12:36:03 pm »

If you'll believe it, that "pathing nightmare" there is significantly better for pathing performance than just mining out the whole Z could ever hope to be.

Quote from: Robsoie
I guess it's hardcoded, but in case there's any hope i'll still ask : is that 26 tiles max requirement for line of sight checks editable/moddable ?

Technically it shouldn't be hard to binary patch, haha.

Panando

  • Bay Watcher
Re: FPS Science and observations with 1000 dorfs population
« Reply #13 on: March 14, 2023, 12:55:20 pm »

Quote from: Putnam
If you'll believe it, that "pathing nightmare" there is significantly better for pathing performance than just mining out the whole Z could ever hope to be.


Incidentally, why's that? One would logically think that in the absence of obstacles A* should just find something pretty close to a straight line and be done with a minimum of CPU cycles. Is it because of the "high traffic" tile designations creating the possibility that there could be a less direct "high traffic" route forcing A* to look further?

edit: now I think about it, I can see how if there are multiple z-levels involved (pathing from one z-level to another) A* might have to hunt around a lot to find passes between the z-levels, if such passages are infrequent. Open areas would increase the search space. I'm still curious if the existence of high traffic designations makes all pathfinding more expensive, or if it's implemented using some hack.
« Last Edit: March 14, 2023, 05:54:27 pm by Panando »

Robsoie

  • Bay Watcher
  • Urist McAngry
Re: FPS Science and observations with 1000 dorfs population
« Reply #14 on: March 14, 2023, 01:04:56 pm »

I assume that if the pathfinder is blocked by walls, it will "explore" only the corridors.
In a fully open room it would explore the whole zone, which probably means many more calculations.
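
One way to put a number on that guess is to count how many tiles a search could possibly visit in each design. The sketch below is a plain flood fill with 8-directional movement, not the game's A* (nothing in this thread describes how that is implemented), so it only gives an upper bound on the search space. It assumes units can step diagonally between checkerboard walls, as the test forts rely on.

```python
from collections import deque


def flood_count(width, height, is_open, start=(0, 0)):
    """Count tiles reachable from `start` with 8-directional steps."""
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nxt = (x + dx, y + dy)
                if nxt in seen:
                    continue  # also skips the (0, 0) offset
                nx, ny = nxt
                if 0 <= nx < width and 0 <= ny < height and is_open(nx, ny):
                    seen.add(nxt)
                    queue.append(nxt)
    return len(seen)


def open_floor(x, y):
    """Whole z-level mined out."""
    return True


def checkerboard(x, y):
    """Every other tile left as wall."""
    return (x + y) % 2 == 0


if __name__ == "__main__":
    print("open floor:", flood_count(48, 48, open_floor))
    print("checkerboard:", flood_count(48, 48, checkerboard))
```

On one 48x48 embark tile the checkerboard leaves half as many walkable tiles as the open floor, so a search that floods outward has at most half the ground to cover.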