I’m finally getting a chance to look seriously at the standoff ODD, so I wanted to pull together my own notes and get some discussion going here that we can feed back to the proposers.

1) Why such an elaborate header? (see https://github.com/laurentromary/stdfSpec/blob/master/Specification/standoff...) I can understand wanting to have some description of the content of <stdf> (or <ldb>, as Syd proposed; I happen to like that better), but this seems maybe a bit overboard. Why can’t the document’s own revisionDesc serve to document revisions to the standoff section, for example?

2) Since standoff elements can nest, why do we need the <annotations> element? (see https://github.com/laurentromary/stdfSpec/blob/master/Specification/standoff...). Wouldn’t it be simpler to allow the standoff element just to contain a heading and its content, optionally including more standoff elements?

3) I can sort of guess what <mapStruct> might be for, but it isn’t very clear (see https://github.com/laurentromary/stdfSpec/blob/master/Specification/standoff...). Do we not already have sufficient mechanisms for attaching standoff markup to its targets?

That’s all I’ve got for now. What do you think? We’re going to beam Laurent in for a discussion at the F2F on Friday the 29th at 10:00.
On 06.05.2015 at 14:46, Hugh Cayless wrote:
> I’m finally getting a chance to look seriously at the standoff ODD, so I wanted to pull together my own notes and get some discussion going here that we can feed back to the proposers.
>
> 1) Why such an elaborate header? (see https://github.com/laurentromary/stdfSpec/blob/master/Specification/standoff...) I can understand wanting to have some description of the content of <stdf> (or <ldb>, as Syd proposed; I happen to like that better), but this seems maybe a bit overboard. Why can’t the document’s own revisionDesc serve to document revisions to the standoff section, for example?

I think we took biblFull as a model, removed some parts, and added others. Generally, stdf can come in two flavors. In the standalone case, stdf is the only resourceLike and there is no text; the teiHeader then contains the metadata about the stdf element. (That’s why the soHeader is optional.) When stdf is attached to some resourceLike and/or some text, we need an elaborate metadata container for the stdf element, since it might come from a very different processing chain, or be reused, or … I don’t think that’s one of the weak spots of the proposal ;)

> 2) Since standoff elements can nest, why do we need the <annotations> element? (see https://github.com/laurentromary/stdfSpec/blob/master/Specification/standoff...). Wouldn’t it be simpler to allow the standoff element just to contain a heading and its content, optionally including more standoff elements?

Grouping and nesting could be achieved in other ways, I agree. But I guess Laurent will have an argument to make!

> 3) I can sort of guess what <mapStruct> might be for, but it isn’t very clear (see https://github.com/laurentromary/stdfSpec/blob/master/Specification/standoff...). Do we not already have sufficient mechanisms for attaching standoff markup to its targets?

We were thinking of providing a mechanism to, e.g., encode poems standoff. So we wanted to create the structure (lg/l etc.) within the annotation while only pointing at the text. I’m not aware of any existing mechanism?

Best
Peter
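
As an illustration of the poem use case described above, here is a minimal sketch of the general idea: the verse structure lives apart from the text and only points into it. It is not the proposal's <mapStruct> syntax, which this thread does not show. The character-offset pointers, the `Span` class, and the names `resolve`, `render_lines`, `BASE_TEXT`, and `LINE_GROUP` are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Span:
    """Hypothetical standoff pointer: a character range in the base text."""
    start: int  # inclusive offset
    end: int    # exclusive offset


def resolve(text: str, span: Span) -> str:
    """Return the stretch of base text that a pointer refers to."""
    return text[span.start:span.end]


def render_lines(text: str, line_group: List[Span]) -> List[str]:
    """Resolve every line pointer of a line group against the base text."""
    return [resolve(text, span) for span in line_group]


# The base text carries no verse markup at all.
BASE_TEXT = "Roses are red Violets are blue"

# Assumed standoff structure: one line group (cf. lg) holding two lines (cf. l),
# each of which only points at the text instead of containing it.
LINE_GROUP = [Span(0, 13), Span(14, 30)]

if __name__ == "__main__":
    for line in render_lines(BASE_TEXT, LINE_GROUP):
        print(line)
```

The base text stays untouched here, so several competing structures (verse lines, syntactic units, and so on) could point at the same characters without interfering with one another.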
On 6 May 2015 at 15:28, Peter Stadler wrote:
> On 06.05.2015 at 14:46, Hugh Cayless wrote:
>> 1) Why such an elaborate header? [...]
> I think we took biblFull as a model, removed some parts, and added others. Generally, stdf can come in two flavors. In the standalone case, stdf is the only resourceLike and there is no text; the teiHeader then contains the metadata about the stdf element. (That’s why the soHeader is optional.) When stdf is attached to some resourceLike and/or some text, we need an elaborate metadata container for the stdf element, since it might come from a very different processing chain, or be reused, or … I don’t think that’s one of the weak spots of the proposal ;)

Indeed, one should really see annotations as a quite autonomous component of a document, which can carry a lot of specific metadata: the persons or tools responsible for the content, profile information (e.g. tagset, annotation vocabulary in general), or versioning. If you see the stdf content as additional layers complementing an existing document, you may simply not want to clutter the main header with the corresponding material.

A quick word concerning the name: please avoid a hype-driven reference to linked open data here. Stand-off annotation draws on around 20 years of work, projects, what have you (see e.g. http://www.ltg.ed.ac.uk/~ht/sgmleu97.html), and corresponds to something deeply anchored in the linguistic corpora community in particular. A mail is not the place to give a complete historical background, but we missed a couple of opportunities in this direction in the past, and choosing a clear appellation that reflects the concept (std, standoff, stand-off, stdfAnnotations, what have you) would be more sustainable and legible than a LOD-based appellation (and for LOD, yes, you don’t need a header, you don’t need recursive content, etc.; the wrong message all in all).

>> 2) Since standoff elements can nest, why do we need the <annotations> element? [...]
> Grouping and nesting could be achieved in other ways, I agree. But I guess Laurent will have an argument to make!

<annotations> mirrors <text> in a normal document. At the workshop last year we did not want to have annotation content just thrown in after the header. The main argument I would see here is encoding elegance, but I am conscious that it may not be the most technical one. But yes, just imagine dropping <text> from a TEI document (and even <body>) :-}

>> 3) I can sort of guess what <mapStruct> might be for, but it isn’t very clear [...]
> We were thinking of providing a mechanism to, e.g., encode poems standoff. So we wanted to create the structure (lg/l etc.) within the annotation while only pointing at the text. I’m not aware of any existing mechanism?

This is something we devised during the workshop, and I never came across a clear usage scenario/example/spec for it… It could easily be added afterwards if we hear about a strong requirement from the community. @Peter: shall we drop this at this stage?

Laurent Romary
INRIA
laurent.romary@inria.fr
Regarding mapStruct: I don’t mind dropping it. But we could ask Andreas Witt or others how strongly they feel about it. Maybe he’s got a good example supporting it?!

Cheers
Peter
On 08.05.2015 at 13:49, Laurent Romary wrote:
> This is something we devised during the workshop, and I never came across a clear usage scenario/example/spec for it… It could easily be added afterwards if we hear about a strong requirement from the community. @Peter: shall we drop this at this stage?
You’re right. Let me write to Andreas and Piotr and see if they are motivated, in which case they could also provide content for the proposal.

Cheers,
Laurent
On 8 May 2015 at 14:21, Peter Stadler wrote:
> Regarding mapStruct: I don’t mind dropping it. But we could ask Andreas Witt or others how strongly they feel about it. Maybe he’s got a good example supporting it?!
>
> Cheers
> Peter
PS: I’ve spent two days this week at a European organization that manages around 100 million documents in TEI :-} and is waiting for this component to be in place so that it can integrate the work of its staff, whose activity produces quite a bunch of annotations on those very documents. Our usage statistics may explode right at the launch of <stdf>…
participants (3)
- Hugh Cayless
- Laurent Romary
- Peter Stadler