Topic: Proposal: a standard format for mods in a diff/patch Mod Starter Pack (Read 42368 times)

King Mir · « **Reply #225 on:** August 22, 2014, 06:45:37 am »

Quote from: PeridexisErrant on August 21, 2014, 11:21:47 pm

Code: [Select]
import os
import difflib

context_lines = 2
patch_path = mixed_raw_folder + file + '.patch'

# Start from a clean patch file
if os.path.isfile(patch_path):
    os.remove(patch_path)

# Write the unified diff of the vanilla vs. modded raw file
with open(patch_path, 'a') as item:
    for line in difflib.unified_diff(open(vanilla_raw_folder + file).readlines(),
                                     open(mod_raw_folder + file).readlines(),
                                     n=context_lines):
        item.write(line)
Creating a unified patch file with a few lines of context (two lines matches within but not between objects) fixes the [pet] issue, but I can't work out how to apply a unified patch with python. Argh.

You can get difflib.unified_diff() to print out a unified diff, but applying a patch (merging) isn't provided in the library.
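For reference, here is a minimal sketch of the diff half, using only the standard library. The tiny raws and the helper name make_patch are made up for illustration; real files would be read with readlines() as in the quoted script.

```python
import difflib

# Hypothetical miniature raws; real files would be read with readlines().
vanilla = [
    "[CREATURE:GIANT_LEOPARD_GECKO]\n",
    "[PETVALUE:500]\n",
    "[MOUNT_EXOTIC]\n",
    "[PREFSTRING:coloration]\n",
]
modded = vanilla + ["[PET]\n"]


def make_patch(old_lines, new_lines, context_lines=2):
    """Return the unified diff between two lists of lines as one string."""
    diff = difflib.unified_diff(
        old_lines, new_lines,
        fromfile="vanilla", tofile="mod",
        n=context_lines,
    )
    return "".join(diff)


patch_text = make_patch(vanilla, modded)
print(patch_text)
```

Identical inputs produce an empty string, so an empty patch can be used as a cheap "mod does not touch this file" check.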

Quote

Quote from: King Mir on August 21, 2014, 10:01:19 pm
<Python 3.x compatible version>
Well, it no longer freaks out about the print statement. Unfortunately it also outputs the contents of the vanilla file.

My testing, though I don't follow the various opcodes, shows that the output_file_temp returned by do_merge_seq() is the same as the contents of the vanilla file.

It should return 1 though. If it returns 1, then the output is garbage; it detected a conflict and gave up. It prints out vanilla because it got to the end of vanilla before finding a conflict. To see whether it returned 1, run "echo $?" immediately after running it. You can run it like this:
python mergemod.py mod_file.txt vanilla_file.txt target_file.txt ; echo $?
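If the launcher ends up calling the merge script from Python rather than from a shell, the same check could look roughly like this. It is only a sketch: run_and_get_status is a made-up helper, and a throwaway child process that exits with status 1 stands in for mergemod.py.

```python
import subprocess
import sys


def run_and_get_status(command):
    """Run a command and return its exit status (what `echo $?` would show)."""
    completed = subprocess.run(command)
    return completed.returncode


# Stand-in for mergemod.py: a child Python process that exits with status 1,
# the value described above as meaning "conflict detected".
status = run_and_get_status([sys.executable, "-c", "import sys; sys.exit(1)"])
if status != 0:
    print("merge reported a conflict; discard the output file")
```

For the real script the command list would presumably be [sys.executable, "mergemod.py", "mod_file.txt", "vanilla_file.txt", "target_file.txt"], matching the shell line above.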

But I need to test it more: find out why it's complaining about maximum recursion depth for thistleknot, and check it further for correctness.

MagiX · « **Reply #226 on:** August 22, 2014, 06:56:57 am »

Quote from: King Mir on August 21, 2014, 08:10:23 pm

I'll reiterate this test case:
Code: (vanilla) [Select]
[CREATURE:GIANT_LEOPARD_GECKO]
	[COPY_TAGS_FROM:GECKO_LEOPARD]
	[APPLY_CREATURE_VARIATION:GIANT]
	[CV_REMOVE_TAG:CHANGE_BODY_SIZE_PERC]
	[APPLY_CURRENT_CREATURE_VARIATION]
	[GO_TO_END]
	[SELECT_CASTE:ALL]
	[CHANGE_BODY_SIZE_PERC:400700]
	[GO_TO_START]
	[NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
	[CASTE_NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
	[DESCRIPTION:A large monster in the shape of a gecko.]
	[POPULATION_NUMBER:10:20]
	[CLUSTER_NUMBER:1:1]
	[CREATURE_TILE:'G']
	[COLOR:6:0:1]
	[PETVALUE:500]
	[MOUNT_EXOTIC]
	[GO_TO_END]
	[PREFSTRING:amazing sticky feet]
	[PREFSTRING:coloration]
	[APPLY_CREATURE_VARIATION:STANDARD_QUADRUPED_GAITS:900:657:438:219:1900:2900] 40 kph
	[APPLY_CREATURE_VARIATION:STANDARD_SWIMMING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
	[APPLY_CREATURE_VARIATION:STANDARD_CRAWLING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
	[APPLY_CREATURE_VARIATION:STANDARD_CLIMBING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
Code: (mod adding pet tag) [Select]
[CREATURE:GIANT_LEOPARD_GECKO]
	[COPY_TAGS_FROM:GECKO_LEOPARD]
	[APPLY_CREATURE_VARIATION:GIANT]
	[CV_REMOVE_TAG:CHANGE_BODY_SIZE_PERC]
	[APPLY_CURRENT_CREATURE_VARIATION]
	[GO_TO_END]
	[SELECT_CASTE:ALL]
	[CHANGE_BODY_SIZE_PERC:400700]
	[GO_TO_START]
	[NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
	[CASTE_NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
	[DESCRIPTION:A large monster in the shape of a gecko.]
	[POPULATION_NUMBER:10:20]
	[CLUSTER_NUMBER:1:1]
	[CREATURE_TILE:'G']
	[COLOR:6:0:1]
	[PETVALUE:500]
	[MOUNT_EXOTIC]
	[GO_TO_END]
	[PREFSTRING:amazing sticky feet]
	[PREFSTRING:coloration]
	[APPLY_CREATURE_VARIATION:STANDARD_QUADRUPED_GAITS:900:657:438:219:1900:2900] 40 kph
	[APPLY_CREATURE_VARIATION:STANDARD_SWIMMING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
	[APPLY_CREATURE_VARIATION:STANDARD_CRAWLING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
	[APPLY_CREATURE_VARIATION:STANDARD_CLIMBING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
	[PET]
Code: (mod adding creature) [Select]
[CREATURE:GIANT_LEOPARD_GECKO]
	[COPY_TAGS_FROM:GECKO_LEOPARD]
	[APPLY_CREATURE_VARIATION:GIANT]
	[CV_REMOVE_TAG:CHANGE_BODY_SIZE_PERC]
	[APPLY_CURRENT_CREATURE_VARIATION]
	[GO_TO_END]
	[SELECT_CASTE:ALL]
	[CHANGE_BODY_SIZE_PERC:400700]
	[GO_TO_START]
	[NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
	[CASTE_NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
	[DESCRIPTION:A large monster in the shape of a gecko.]
	[POPULATION_NUMBER:10:20]
	[CLUSTER_NUMBER:1:1]
	[CREATURE_TILE:'G']
	[COLOR:6:0:1]
	[PET_EXOTIC]
	[PETVALUE:500]
	[MOUNT_EXOTIC]
	[GO_TO_END]
	[PREFSTRING:amazing sticky feet]
	[PREFSTRING:coloration]
	[APPLY_CREATURE_VARIATION:STANDARD_QUADRUPED_GAITS:900:657:438:219:1900:2900] 40 kph
	[APPLY_CREATURE_VARIATION:STANDARD_SWIMMING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
	[APPLY_CREATURE_VARIATION:STANDARD_CRAWLING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
	[APPLY_CREATURE_VARIATION:STANDARD_CLIMBING_GAITS:2990:2257:1525:731:4300:6100] 12 kph

[CREATURE:DESERT TORTOISE]
	[DESCRIPTION:A tiny shelled reptile that lives in the desert.]
	[NAME:desert tortoise:desert tortoises:desert tortoise]
	[CASTE_NAME:desert tortoise:desert tortoises:desert tortoise]
	[CHILD:1][GENERAL_CHILD_NAME:desert tortoise hatchling:desert tortoise hatchlings]
	[CREATURE_TILE:'t'][COLOR:6:0:0]
	[PETVALUE:50]
	[BENIGN][NATURAL][PET_EXOTIC]
	[BIOME:ANY_DESERT]
	[LARGE_ROAMING]
	[POPULATION_NUMBER:10:30]
	[CLUSTER_NUMBER:1:1]
	[PREFSTRING:shells]
	[PREFSTRING:longevity]
	[CANNOT_JUMP]
Can these be merged? Does the order of merging affect the result? I posit that it must not be allowed to be merged. Does diff3 merging allow it? Does Stephan Boyer's code?

I haven't read the entire thread, just the last few pages... quite a discussion going on here

What about writing a custom json/xml/whatever style parser that puts these things into (multi-level) dict structures and then comparing the dict structures? It could look like this:

Code: (vanilla) [Select]

vanilla_dict={"Creature":{"Giant_leopard_gekko":{all the key/value pairs from vanilla here}}}

Code: (Pet) [Select]

Mod_1={"Creature":{"Giant_leopard_gekko":{all the key/value pairs from vanilla here,"PET":''}}}

Code: (Add animal) [Select]

Mod_2={"Creature":{"Giant_leopard_gekko":{all the key/value pairs from vanilla here},
"Desert_tortoise":{all the key/value pairs from new animal here}}}

So for every key, one could check whether the value (i.e. a nested dict) is the same, and if it is not, do the comparison recursively. A simple two-dict comparison can be found here
We thus have some cases:

Stuff that is unchanged is copied to the mixed mod dict
Stuff that is simply added (as the [PET] tag or the new creature) will be added to the mixed mod dict
Stuff that is changed --> check if the same key/value pair is changed in both mods --> yes: problem; no: copy the change to the mixed mod dict
Stuff that is removed in the mod --> remove from mixed mod dict

and as a final step, one should parse the mixed mod dict into a file.
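The rules above can be sketched as a recursive three-way merge. This is only an illustration under simplifying assumptions: merge3, MergeConflict and the tiny dicts are made-up names and data, every value is either a nested dict or a string, and two mods adding different content under the same brand-new key is simply reported as a conflict.

```python
class MergeConflict(Exception):
    """Raised when both mods change the same key in different ways."""


_MISSING = object()  # sentinel meaning "key not present"


def merge3(base, mod_a, mod_b, path=()):
    """Three-way merge of nested dicts; base is the vanilla dict."""
    merged = {}
    # Visit every key seen in any of the three dicts, in first-seen order.
    for key in dict.fromkeys([*base, *mod_a, *mod_b]):
        old = base.get(key, _MISSING)
        a = mod_a.get(key, _MISSING)
        b = mod_b.get(key, _MISSING)
        if a == b:          # unchanged, or both mods made the same change
            result = a
        elif a == old:      # only mod_b touched it (change, add or remove)
            result = b
        elif b == old:      # only mod_a touched it
            result = a
        elif all(isinstance(v, dict) for v in (old, a, b)):
            result = merge3(old, a, b, path + (key,))  # look one level deeper
        else:
            raise MergeConflict("/".join(path + (key,)))
        if result is not _MISSING:  # _MISSING here means "removed by a mod"
            merged[key] = result
    return merged


vanilla = {"Creature": {"Giant_leopard_gecko": {"PETVALUE": "500", "MOUNT_EXOTIC": ""}}}
mod_1 = {"Creature": {"Giant_leopard_gecko": {"PETVALUE": "500", "MOUNT_EXOTIC": "", "PET": ""}}}
mod_2 = {"Creature": {"Giant_leopard_gecko": {"PETVALUE": "500", "MOUNT_EXOTIC": ""},
                      "Desert_tortoise": {"PETVALUE": "50"}}}
mixed = merge3(vanilla, mod_1, mod_2)
print(mixed)
```

Note that a plain dict keyed by token name cannot hold repeated tokens such as the two PREFSTRING lines in the test case, so a real version would need a richer value type.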

King Mir · « **Reply #227 on:** August 22, 2014, 07:42:27 am »

Quote from: MagiX on August 22, 2014, 06:56:57 am

What about writing a custom json/xml/whatever style parser that puts these things into (multi-level) dict structures and then comparing the dict structures?

Separating it into two levels like that does solve that particular problem. But you can't just use a dict, because you need to preserve the order of some tags.

But using JSON or XML may mean we can find a good diff/merge tool that is aware that order matters for some things and not for others. XML is more powerful than JSON here, because the same XML tag can have both attributes, which are unordered, and nested tags, which are ordered.
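To make the attribute/child distinction concrete, here is a small sketch with the standard library's ElementTree. The element names ("creature", "token") and the choice of which raw tags become attributes are assumptions for illustration only, not a proposed schema.

```python
import xml.etree.ElementTree as ET


def build(flags, ordered_tokens):
    """Order-independent tags become attributes; order-dependent ones become children."""
    creature = ET.Element("creature", flags)
    for token in ordered_tokens:
        ET.SubElement(creature, "token").text = token
    return creature


def same_creature(a, b):
    """Attributes compare as a dict (unordered); children compare as a list (ordered)."""
    return a.attrib == b.attrib and [c.text for c in a] == [c.text for c in b]


a = build({"PET": "", "PETVALUE": "500"}, ["GO_TO_END", "SELECT_CASTE:ALL"])
b = build({"PETVALUE": "500", "PET": ""}, ["GO_TO_END", "SELECT_CASTE:ALL"])
c = build({"PET": "", "PETVALUE": "500"}, ["SELECT_CASTE:ALL", "GO_TO_END"])

print(ET.tostring(a, encoding="unicode"))
print(same_creature(a, b))  # attribute order differs only
print(same_creature(a, c))  # child order differs
```

An order-aware merge tool would have to make the same distinction that same_creature makes here.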

PeridexisErrant · « **Reply #228 on:** August 22, 2014, 07:49:51 am »

Either sounds good, both are beyond my current skills.

Go for it, and I'll keep writing documentation and design ideas for stuff I can't code yet

MagiX · « **Reply #229 on:** August 22, 2014, 08:34:48 am »

Quote from: King Mir on August 22, 2014, 07:42:27 am

you need to preserve the order of some tags.

Is there some kind of rule for that? After just briefly scanning some of the files, I haven't seen a "clear" pattern, besides indentation and even that does not seem to be consistent in all cases.

Quote from: PeridexisErrant on August 22, 2014, 07:49:51 am

beyond my current skills.

Time to learn something new

I have no clue how to approach this either, but just thought about it and why not share my idea

King Mir · « **Reply #230 on:** August 22, 2014, 08:47:30 am »

Well the first step would be to find an xml merge tool. You might also try to write an XSLT script that does such a merge and properly identifies conflicting mergers; writing an XSLT script may be easier than writing a merge algorithm from scratch in python. Without such a tool, taking a round-trip through xml is pointless.

IMO, a from scratch python script that does merging on a 2+ level structure is probably the best way to go eventually.

Anyway, I'm going to keep working on my merge algorithm for now. And maybe add more boilerplate.

King Mir · « **Reply #231 on:** August 22, 2014, 08:52:25 am »

Quote from: MagiX on August 22, 2014, 08:34:48 am

Quote from: King Mir on August 22, 2014, 07:42:27 am
you need to preserve the order of some tags.
Is there some kind of rule for that? After just briefly scanning some of the files, I haven't seen a "clear" pattern, besides indentation and even that does not seem to be consistent in all cases.

I'm not a modder, so I don't know the details, but some tags clearly suggest that order matters for them, like [GO_TO_END]. Other tags, like [PET] can be put anywhere after the creature token.

Quote from: PeridexisErrant on August 22, 2014, 07:49:51 am

Go for it, and I'll keep writing documentation and design ideas for stuff I can't code yet

Design is good. There's a lot of fairly straightforward stuff that needs to be done to manage everything. You need to be able to specify the list of mods. You need to be able to delete the output when merging fails. You probably want to figure out which two mods conflict when merging, which requires extra analysis. And of course the GUI -- designing and stubbing out the GUI can help plan what features you want even if they aren't immediately implemented.
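As a rough sketch of the "which two mods conflict" analysis: if each mod can be reduced to the set of (object, token) keys it touches, a pairwise intersection finds the candidates. The mod names, the key format and the helper conflicting_pairs are all hypothetical, and treating any shared key as a conflict is a deliberate simplification.

```python
from itertools import combinations


def conflicting_pairs(changes_by_mod):
    """Return (mod_a, mod_b, shared_keys) for every pair of mods touching the same keys."""
    conflicts = []
    for name_a, name_b in combinations(sorted(changes_by_mod), 2):
        shared = changes_by_mod[name_a] & changes_by_mod[name_b]
        if shared:
            conflicts.append((name_a, name_b, sorted(shared)))
    return conflicts


# Hypothetical mods, each reduced to the (object, token) keys it changes.
changes = {
    "pet_mod": {("GIANT_LEOPARD_GECKO", "PETVALUE")},
    "price_mod": {("GIANT_LEOPARD_GECKO", "PETVALUE"), ("GIANT_LEOPARD_GECKO", "PET")},
    "tortoise_mod": {("DESERT TORTOISE", "CREATURE")},
}
conflicts = conflicting_pairs(changes)
print(conflicts)
```

The GUI could then name the two offending mods instead of just reporting that the merge failed.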

Putnam · « **Reply #232 on:** August 22, 2014, 09:30:04 am »

Quote from: King Mir on August 22, 2014, 08:52:25 am

I'm not a modder, so I don't know the details, but some tags clearly suggest that order matters for them, like [GO_TO_END]. Other tags, like [PET] can be put anywhere after the creature token.

[GO_TO_END] is pretty much only there because castes are not declared at the start. Castes and non-creature tokens embedded in creatures (i.e. tissues and materials) are the only things where positioning matters.

thistleknot · « **Reply #233 on:** August 22, 2014, 11:19:18 am »

So I've been thinking. The a vs. b vs. c token conflicts between mods (a being our base/common ancestor) are mostly what I would call "collisions": both mods try to add to the same line (neither replaces anything; both changes are additive).

I was thinking we could somehow mark them in the patch files in a post-processing step. Maybe we can modify the changed token to be ]###newobject or ]#oldobjectToken

Since I figured out how diff3 works I've been reusing kdiff3 in a whole new way.

Most of these collisions are resolved by inserting b before c or c before b. If we could somehow incorporate that logic as a post-processor command in the patch files, maybe an ]###add or ]#replace or ###Del... on each add or replace line (in a patch file),
we might be able to resolve this issue

Merkator · « **Reply #234 on:** August 22, 2014, 11:29:10 am »

I think we'll end up with a full-featured parser. Anyone here with some knowledge of Haskell and Parsec?

BTW I wrote my small diff parser and ended up with 100 LOC.

I'll post it once I've done some bug testing and cleaned up this ~~piece of...~~ I mean beauty.

Dirst · « **Reply #235 on:** August 22, 2014, 11:40:28 am »

Quote from: Putnam on August 22, 2014, 09:30:04 am

[GO_TO_END] is pretty much only there because castes are not declared at the start. Castes and non-creature tokens embedded in creatures (i.e. tissues and materials) are the only things where positioning matters.

There are four kinds of order dependence in the raws, with an example for each at the end.

1. The header (filename and OBJECT: declaration) needs to come first in a file.
2. Variations need to be defined after the base creature.
3. Several tokens accept a list of subtokens to build a structure. The structure closes when the parser hits the first token that isn't a valid subtoken in that context.
4. Castes are a special case of 3. Everything that appears before the first caste declaration is applied to ALL castes, nothing closes a caste structure except another caste declaration, and a caste declaration can be re-opened later in the same creature.

Example of 1: the creature_standard and [OBJECT:CREATURE] at the top of a file.
Examples of 2: a giant kea can't be defined until a kea is already defined, an olm man can't be defined before an olm (tiger men are an exception... they are made from scratch rather than being a tiger variant).
Examples of 3: each CREATURE sucks up all tags until it hits another CREATURE tag, and a SYNDROME sucks up all tags until it runs out of syndrome-defining or creature-effect-defining tags.
Example of 4: the intelligent creatures tend to have a lot of definition up front, briefly split into MALE and FEMALE castes, then select all castes again to finish up.

The easiest way to handle this is to hardcode 1, and to handle 3 and 4 by treating the order within a top-level object as critical. Case 2 is the one that prevents us from alphabetizing things.

One way to handle that is to do a two-level sort. Base creatures are listed alphabetically, then all variations are listed alphabetically. The logic could be re-used later if we want to alphabetize gems within categories or something weird like that.
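A two-level sort along those lines might be sketched as follows. It assumes, purely for illustration, that a variation can be recognised by a [COPY_TAGS_FROM:...] token (as in the giant leopard gecko example earlier in the thread); the creature IDs are simplified stand-ins, and a variation built on another variation is not handled.

```python
def is_variation(tokens):
    """Assumption: a creature is a variation if it copies tags from another one."""
    return any(token.startswith("[COPY_TAGS_FROM:") for token in tokens)


def two_level_sort(creatures):
    """creatures maps creature ID -> list of raw token strings; returns the IDs in output order."""
    bases = sorted(cid for cid, tokens in creatures.items() if not is_variation(tokens))
    variations = sorted(cid for cid, tokens in creatures.items() if is_variation(tokens))
    return bases + variations  # every base comes before any variation


# Simplified, hypothetical IDs based on the examples above.
creatures = {
    "GIANT_KEA": ["[COPY_TAGS_FROM:KEA]", "[APPLY_CREATURE_VARIATION:GIANT]"],
    "OLM_MAN": ["[COPY_TAGS_FROM:OLM]"],
    "TIGER_MAN": ["[DESCRIPTION:made from scratch]"],
    "KEA": ["[DESCRIPTION:a bird]"],
    "OLM": ["[DESCRIPTION:an amphibian]"],
}
order = two_level_sort(creatures)
print(order)
```

Swapping is_variation for a different predicate would give the "gems within categories" reuse with the same two-pass structure.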

thistleknot · « **Reply #236 on:** August 22, 2014, 12:18:19 pm »

That is exactly the info we need to build a raw structure.

So...

I didn't have much luck tweaking my script to replace blank lines with [token] and then back-update with blank lines...

but... I did get the command down to 1 line in a for loop

ParseRawsv4a.bat

Code: [Select]

@echo off
REM put each token on its own line | remove tabs | trim whitespace and remove blank lines
for /f %%a in ('dir /b *.txt') do sed -e "s/\[[^][]*\]/\n&\n/g" %%~na.txt | sed -r "s/\t//g" | sed -e "s/^ *//; s/ *$//; /^$/d; s/\r//; /^\s*$/d" > %%~na.out
REM cleanup: replace each original with its flattened version
for /f %%a in ('dir /b *.out') do move /y %%a %%~na.txt
echo on

I think this flattening should only be applied to the files within the [objects] folder. Things that affect speech and text seem to be read per line rather than per token.
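For anyone without sed on Windows, the same flattening can be approximated in Python. This is a sketch rather than a drop-in replacement: flatten_raw and TOKEN_RE are made-up names, and it works on a string instead of rewriting files in place.

```python
import re

# One bracketed token with no nested brackets, like the sed pattern \[[^][]*\]
TOKEN_RE = re.compile(r"\[[^\[\]]*\]")


def flatten_raw(text):
    """Put every [TOKEN] on its own line, strip whitespace, drop blank lines."""
    # Surround each token with newlines so it ends up alone on its line.
    text = TOKEN_RE.sub(lambda match: "\n" + match.group(0) + "\n", text)
    # splitlines() also copes with \r\n line endings.
    lines = (line.strip() for line in text.splitlines())
    return "".join(line + "\n" for line in lines if line)


sample = "\t[CHILD:1][GENERAL_CHILD_NAME:hatchling:hatchlings]\r\n\r\n\t[PET] 40 kph\r\n"
flat = flatten_raw(sample)
print(flat)
```

Trailing comments such as "40 kph" end up on their own line, the same as with the sed version.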

I was hoping to address how whitespace removal might affect the case where two mods add trailing tokens at the end of an object. For the dictionary/match approach PE was attempting, I assumed the whitespace trailing any tokens added at the end of an object would be relevant and should NOT be deleted, but idk. Either way, I had a bit of trouble.

There is one way to do it, but then I'd have to use cygwin...
http://stackoverflow.com/questions/11393616/replace-string-that-contains-crlf

Button · « **Reply #237 on:** August 22, 2014, 12:44:44 pm »

Quote from: MagiX on August 22, 2014, 06:56:57 am

What about writing a custom json/xml/whatever style parser that puts these things into (multi-level) dict structures and then comparing the dict structures? This should look like that:

Hey guys, sorry for not keeping you up to date on my postprocessor, blah blah work blah blah food poisoning. What I have so far works a lot like this but without needing XML (yet).

So far what I have is code to read a raw file and parse it into raw objects. The code is on my home computer, but here's the pseudocode as I remember it:

Code: [Select]

for all files:
	for all lines:
		if the line is the first line:
			save line as current_file (required parameter for RawObjects)
		else if the line is a tag:
			parse the tag into a list of tokens
			if the first token is OBJECT:
				save the second token as current_object_type
				clear the current_object pointer
				clear the comments list
			else if the first token is current_object_type:
				if an object of type current_object_type and name tokens[1] exists in all_objects already:
					raise error
				else:
					create a new RawObject as current_object and add it to all_objects
					append stored_comments to current_object.body
					clear stored_comments
					append line to current_object.body
			else the line is a tag but not a creature or object token:
				if current_object exists:
					append stored_comments to current_object.body
					clear stored_comments
					append line to current_object.body for now; more meaningful handling later
				else:
					raise warning
					append line to stored_comments
		else the line is not a tag:
			append the line to the list stored_comments; it will be added to the current object whenever the next tag is written to an object.
finally:
	if stored_comments isn't empty:
		raise warning

The idea is that we parse each mod into a collection of raw objects, indexed by object type and object name. This catches duplicate raws during loading.

It can easily be expanded into comparing each collection of raw objects to each other. Mods which add new raw objects would be trivial with this setup. Mods which remove or make changes to existing objects would require additional handling, but there's plenty of room for it.
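A runnable approximation of the pseudocode above might look like this. It is a sketch reconstructed from the pseudocode, not the actual code on my home computer: RawObject's fields, parse_raw and TAG_RE are assumed names, only the first tag on a line is inspected, and whole lines are stored in the body.

```python
import re
import warnings

TAG_RE = re.compile(r"\[([^\[\]]*)\]")


class RawObject:
    def __init__(self, source_file, object_type, name):
        self.source_file = source_file
        self.object_type = object_type
        self.name = name
        self.body = []  # tag lines and comments, in file order


def parse_raw(lines, all_objects=None):
    """Parse one raw file (an iterable of lines) into all_objects, keyed by (type, name)."""
    all_objects = {} if all_objects is None else all_objects
    current_file = current_type = current_object = None
    stored_comments = []
    for number, line in enumerate(lines):
        line = line.rstrip("\r\n")
        if number == 0:                      # first line names the file
            current_file = line.strip()
            continue
        match = TAG_RE.search(line)
        if not match:                        # not a tag: hold it as a comment
            stored_comments.append(line)
            continue
        tokens = match.group(1).split(":")
        if tokens[0] == "OBJECT":
            current_type, current_object = tokens[1], None
            stored_comments.clear()
        elif tokens[0] == current_type:      # start of a new raw object
            key = (current_type, tokens[1])
            if key in all_objects:
                raise ValueError("duplicate raw object: %s:%s" % key)
            current_object = all_objects[key] = RawObject(current_file, *key)
            current_object.body.extend(stored_comments + [line])
            stored_comments.clear()
        elif current_object is not None:     # ordinary tag inside an object
            current_object.body.extend(stored_comments + [line])
            stored_comments.clear()
        else:
            warnings.warn("tag outside any object: " + line)
            stored_comments.append(line)
    if stored_comments:
        warnings.warn("trailing comments not attached to any object")
    return all_objects


sample = ["creature_test", "[OBJECT:CREATURE]", "a comment",
          "[CREATURE:GECKO]", "\t[PET]", "[CREATURE:TORTOISE]", "\t[PETVALUE:50]"]
objects = parse_raw(sample)
print(sorted(objects))
```

Passing the same all_objects dict to parse_raw for every file of a mod is what makes the duplicate check work across files.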

I was messing around with formats for defining legal raw objects of various types. Mainly what I found is that XML isn't great for it, because it doesn't deal gracefully with tags which are allowed in any order. Might be best to define a custom format if we want to go into it that far.

Merkator · « **Reply #238 on:** August 22, 2014, 12:59:52 pm »

Button: sounds great. I was thinking about something like that myself.
It may be a much better way.
Storing objects is not much of a problem.

But remember about tag order and the whole [CASTE] thing.
What kind of data structure do you use? A list with a tuple for each token, or an OrderedDict with tuples or OrderedDicts as values?

King Mir · « **Reply #239 on:** August 22, 2014, 01:05:35 pm »

Quote from: Merkator on August 22, 2014, 11:29:10 am

I think we'll end up with a full-featured parser. Anyone here with some knowledge of Haskell and Parsec?

Good point.

I have some experience with parsers and parser generators, but DF raws are so primitive that a parser generator seems overkill. There's very little grammar to DF raws, and checking the grammar is not important at all. On the other hand, maybe a lexer tool would be worthwhile. Thistleknot, you might want to look into this: whether there's a "lex" or lexer generator tool that compiles into Python or a portable language. Maybe ANTLR, if it has a sufficiently documented Python generator. It might make reading and de-serializing raws easier.
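To show how little lexing there is to do, here is a hand-rolled sketch using only the re module instead of a generator tool. The item kinds "TAG" and "COMMENT" and the names lex_raw and LEXER_RE are made up, and splitting on ":" assumes no token argument contains a colon itself.

```python
import re

# Either one bracketed tag, or a run of comment text up to the next bracket or newline.
LEXER_RE = re.compile(r"\[(?P<tag>[^\[\]]*)\]|(?P<comment>[^\[\]\s][^\[\]\n]*)")


def lex_raw(text):
    """Yield ("TAG", [name, arg, ...]) and ("COMMENT", text) items in file order."""
    for match in LEXER_RE.finditer(text):
        if match.group("tag") is not None:
            yield ("TAG", match.group("tag").split(":"))
        else:
            yield ("COMMENT", match.group("comment").strip())


sample = "[CREATURE_TILE:'G'][COLOR:6:0:1]\n\t[PET] 12 kph\n"
items = list(lex_raw(sample))
for item in items:
    print(item)
```

Whitespace between items is skipped, so the stream is the same whether or not the raws have been flattened first.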
