Managing change in an XML environment

Preserving entities and xincludes in the oxygen plugin

nk4um Administrator
Posts: 10
December 14, 2011 10:31

Hi David,

Thanks for your continued feedback; it is appreciated.

Adding oXygen track change output to the DITA- and DocBook-specific comparators is on our to-do list. We also have plans for addressing the comparison of two trees/maps of files and have done some work in this area. I will post a message on this forum when we have a more solid timetable for the inclusion of these features.

Thanks for your suggestions about the CMS integration; we will take them into consideration in our future planning.

Anthony.

nk4um User
Posts: 2
December 10, 2011 22:06

Hi Mike,

Rereading this thread and playing with the new version of the plugin, I see that I had missed some of the functionality of the plugin because I was only playing with the DocBook comparison. When I select "Compare XML" documents, entities are indeed preserved in the output document, as well as the DOCTYPE declaration and all the declared entities. That's very impressive. Is it not possible to combine the DocBook revision flag approach with roundtrip?

Regarding the xinclude/conref scenario, we're in Use Case #1 in that we manage our files in a source control system (git). The use cases I envision are:

1. Preparing a redline version of the document for reviewers. This is something we can do now with DeltaXML as is, since you just produce a new doc, generate output, and throw it away. I can imagine a super-fancy CMS that includes the server version of your project internally where you can pick two arbitrary versions of the file in a GUI and have it spit out a delta and then render that. That would be cool, but it's not what we're doing here.

2. Selectively merging in changes from a version of the source that was edited by someone using an editor other than Oxygen that pretty prints the whole doc (i.e. in a case where the source control system's merge capabilities cannot help us). I can also do that now using the "Compare XML" feature and opening each file independently. If the document has a large number of xincluded files, that would be inconvenient. If DeltaXML could reproduce the tree (or even overwrite one of the sets of files), that would be convenient. Then the workflow would be something like this:

   a. Select the current file from the source control system as Input A.
   b. Select the externally modified file as Input B.
   c. Overwrite Input A with the diff. This will overwrite all the files, creating new files and directories as necessary. The user is responsible for checking images.
   d. Accept and reject changes as appropriate.
   e. Commit the result back into the source control system.

Merging in extensive changes from a collection of files that have been "rewrapped" is a scenario I kind of dread. Oxygen itself is very smart in the way it pretty prints, in that it only rewraps lines that were edited. Other WYSIWYG editors are less graceful and rewrap the whole file.
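To illustrate the "rewrapped" case, here is a minimal Python sketch of a coarse check for whether two versions of a file differ only in line wrapping. It is not how DeltaXML or Oxygen work; the function names are hypothetical, and it assumes whitespace is never significant, which is wrong for elements such as DocBook's programlisting or screen.

```python
import re
import xml.etree.ElementTree as ET


def _collapse(text):
    """Collapse every run of whitespace (including newlines) to one space."""
    return re.sub(r"\s+", " ", text or "")


def _canonical(elem):
    """Build a comparable tuple for an element, ignoring how its text is wrapped."""
    return (
        elem.tag,
        sorted(elem.attrib.items()),
        _collapse(elem.text),
        _collapse(elem.tail),
        [_canonical(child) for child in elem],
    )


def same_apart_from_wrapping(xml_a, xml_b):
    """Return True if the two XML strings differ only in whitespace runs."""
    return _canonical(ET.fromstring(xml_a)) == _canonical(ET.fromstring(xml_b))
```

A check like this could be used to skip files that an external editor only rewrapped, so that only files with real edits need a proper comparison.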

I don't know how helpful any of that is to you. I'm just thinking out loud. I'll spend some more time with the tool now that I understand it better.

Thanks, David

nk4um Administrator
Posts: 10
August 30, 2011 11:47

David,

Thanks for your constructive posting concerning XML entities and xi:includes.

From the entities perspective, our XML to oXygen track change comparison scheme's default behaviour is to preserve (rather than expand) entity declarations and references, in the manner that you describe. Other behaviours can be selected via the comparison scheme's parameters, such as the ability to expand entity references before comparison.

From the xi:includes perspective, our XML to oXygen track change comparison scheme preserves the included elements. It would be straightforward to provide an option to expand the xi:include statements (in a manner similar to that provided by the DocBook revision flag comparison scheme). However, you were suggesting a structural comparison of two 'similar' documents, where included documents are 'recursively' compared. As you stated, this is in general a difficult problem to deal with. However, there may be some use cases that we can address. We have been considering a similar problem of how to compare DITA map files.

Use case 1: The document and its included source are contained in a file system directory structure. Different versions of the document are created by copying the directory structure and modifying the files. In this case the output of the comparison could be a directory structure that mirrors that of the source documents. Assuming there is only a moderate amount of change between the two documents, then it ought to be feasible to track it, and present these changes to the user.
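As a rough illustration of use case 1, the Python sketch below pairs up the files of two copies of a directory structure by relative path, which is one simple way the inputs for a mirrored output tree could be aligned. The function names and the "*.xml" pattern are assumptions made for the example; this does not describe any existing DeltaXML feature, and it would not notice content that moved between files.

```python
from pathlib import Path


def _relative_files(root, pattern):
    """Collect the relative paths of all matching files under root."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob(pattern) if p.is_file()}


def pair_trees(dir_a, dir_b, pattern="*.xml"):
    """Classify files as present in both trees, only in A, or only in B."""
    a = _relative_files(dir_a, pattern)
    b = _relative_files(dir_b, pattern)
    return {
        "both": sorted(a & b),    # candidates for a file-by-file comparison
        "only_a": sorted(a - b),  # deleted (or moved) in version B
        "only_b": sorted(b - a),  # added (or moved) in version B
    }


def output_path(out_dir, relative):
    """Location of the result for `relative` in an output tree mirroring the sources."""
    return Path(out_dir) / relative
```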

Use case 2: The documents are held in a content management system (CMS). In this case, the output structure is less obvious, as you probably do not want to automatically create new tracked change documents in the CMS. Instead, it might be possible to output a directory containing the tracked change documents, with additional markup to say where the source documents appeared in the CMS. This additional markup might then be used by a 'bespoke' tool to assist the user in adding new or updated entries into the CMS.

Overall, we are interested in addressing this structural comparison issue. It is likely that progress in this area will be use case driven, as this will provide a context for making pragmatic decisions about how to align the documents for comparison and how to present the results. Therefore, suggestions of use cases that could be pragmatically addressed are of interest to us.

Thanks again for your feedback,

Anthony.

nk4um User
Posts: 2
July 8, 2011 16:36

Preserving entities and xincludes in the oxygen plugin

Hi there, This is a continuation of a discussion from the Oxygen users list where I asked about preserving entities and xincludes.

Personally, I don't care too much about entities. I try to move people away from them whenever I can. Some folks around here still make limited use of them, but I plan to create alternatives. One problem with entities is that if you parse the doc, they're resolved. So if you want to munge your source with an xslt, you'd have to hide the entities first, then run the xslt, then add the DOCTYPE back. Blech.
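The hide, transform, restore round trip described above might look roughly like the following Python sketch. The "[[entity:name]]" placeholder format and the function names are invented for illustration, the XSLT step itself is not shown (the Python standard library has no XSLT processor), and the regular expressions make no attempt to skip comments or CDATA sections.

```python
import re

PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
# General entity references such as &product; (numeric references do not match).
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.\-]*);")
# Hypothetical placeholder format; assumed not to occur in the real content.
PLACEHOLDER = re.compile(r"\[\[entity:([A-Za-z_][\w.\-]*)\]\]")
# DOCTYPE declaration with an optional internal subset.
DOCTYPE = re.compile(r"<!DOCTYPE[^\[>]*(\[.*?\])?\s*>", re.DOTALL)


def hide_entities(text):
    """Strip the DOCTYPE and replace entity references with placeholders."""
    match = DOCTYPE.search(text)
    doctype = match.group(0) if match else ""
    if match:
        text = text[:match.start()] + text[match.end():]

    def repl(m):
        # Leave the five predefined XML entities alone; any parser resolves them.
        return m.group(0) if m.group(1) in PREDEFINED else "[[entity:%s]]" % m.group(1)

    return ENTITY_REF.sub(repl, text), doctype


def restore_entities(text, doctype):
    """Turn placeholders back into entity references and put the DOCTYPE back."""
    restored = PLACEHOLDER.sub(lambda m: "&%s;" % m.group(1), text)
    if not doctype:
        return restored
    # Reinsert the DOCTYPE after the XML declaration if there is one.
    decl = re.match(r"\s*<\?xml[^>]*\?>\s*", restored)
    pos = decl.end() if decl else 0
    return restored[:pos] + doctype + "\n" + restored[pos:]
```

The text returned by hide_entities parses without the DOCTYPE, so it can be fed to whatever transformation is needed before restore_entities is applied to the serialized result.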

xincludes, however, I do use. Your Oxygen plugin is pretty cool and useful as it is. I'm not sure if what I want regarding xincludes is really feasible, but I'll go ahead and describe it:

I think my dream version of this tool would create the diffs while leaving the structure of the source intact. So if you have a wrapper file with chapters pulled in as xincludes and you diff two versions of the wrapper files, you would want the result to preserve the wrapper/chapter file structure. I can imagine some difficult situations for your tool to deal with. For example, what if between versions of the document, some content was left unchanged but moved into a separate file that is xincluded? When creating the result, which structure would your tool keep? Some users might make extensive use of xinclude to pull in small fragments. The result in that case might be hundreds of small files. What if some of the xincluded files are remote?

I think the only way to do what I'm imagining would be for the tool to write out the result to the file system and then open it in a new oxygen tab. I think you would have to ask the user whether to keep the file structure of doc A or B. Like I said, this may not be feasible or worth doing. If you've broken your doc into chapters you could just diff each chapter in DeltaXML. If you're doing something more granular, that may not be fun.
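To make the wrapper/chapter structure concrete, here is a small Python sketch that records which files a wrapper pulls in through xi:include, following local includes recursively. The function name is hypothetical, and several choices are assumptions: remote hrefs and parse="text" includes are recorded but not followed, and files that rely on undeclared entities will not parse.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"


def include_map(wrapper, seen=None):
    """Map each local file to the hrefs it xincludes, starting from wrapper."""
    seen = {} if seen is None else seen
    path = Path(wrapper).resolve()
    if path in seen:  # already visited; also guards against include cycles
        return seen
    includes = [el for el in ET.parse(str(path)).getroot().iter(XI_INCLUDE)
                if el.get("href")]
    seen[path] = [el.get("href") for el in includes]
    for el in includes:
        href = el.get("href")
        if "://" in href or el.get("parse") == "text":
            continue  # remote or plain-text include: listed but not followed
        target = path.parent / href.split("#")[0]
        if target.is_file():
            include_map(target, seen)
    return seen
```

Running this on both versions of a wrapper and comparing the two maps would show where the file structure itself changed, which is the case where a tool would have to ask whether to keep the structure of doc A or doc B.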

Even if none of that is possible, I think the plugin will be useful to many users as-is. Congratulations on creating it!

David