Hi all, I have a vague memory of this idea being raised at some point, but I can't find any evidence of it, so maybe it was all in my head. After three attempts to get translation projects going have failed almost at the starting gate, it seems to me that we need an alternative strategy. Since the Members Meeting will be in France next year, how about organizing a one-day "translate-a-thon" to update the French translation? We could provide food, organization and guidance, and try to get a large enough group of interested French speakers to storm through as much of the French translation job as can be done in a day. We might get it all done. Even if we don't, I bet we could get far enough with it that the job would get finished over the following weeks. What do you think? Cheers, Martin
Martin, this sounds like a great idea - definitely in French and perhaps on
the fringe of that group, in another one or two languages (Italian?). It
could even have a remote component - people can join in via hangouts and
edit their language. I was thinking, even before you emailed me to ask
about the progress of the Greek translation, that what is needed is some
kind of group excitement and support. I'd been thinking that I should mail
Anna Maria and see if she wanted to put together a virtual editathon at
some time. This can still happen, but the event at the members meeting
would be great. It is also another way to involve the people who are TEI savvy
but not super technical.
But - the discussion about the translations also had some points about how
to re-incorporate translations into the Guidelines and also how to keep
them up to date. Is this still an issue? It would be neat to have the
translations quickly available.
The Greek translation group were very enthusiastic, but haven't made much
if any progress.
--elli
-- tei-council mailing list tei-council@lists.tei-c.org http://lists.lists.tei-c.org/mailman/listinfo/tei-council
PLEASE NOTE: postings to this list are publicly archived
Hi Elli, On 14-12-24 06:16 AM, Mylonas, Elli wrote:
Martin, this sounds like a great idea - definitely in French and perhaps on the fringe of that group, in another one or two languages (Italian?). It could even have a remote component - people can join in via hangouts and edit their language.
I hadn't thought about doing other languages, to be honest; I just thought that since we'll be in France, and presumably we'll have the largest ever group of native French TEI folks on hand, we should focus on that. But extending it to other languages virtually would be great.
I was thinking, even before you emailed me to ask about the progress of the Greek translation, that what is needed is some kind of group excitement and support. I'd been thinking that I should mail Anna Maria and see if she wanted to put together a virtual editathon at some time. This can still happen, but the event at the members meeting would be great. It is also another way to involve the people who are TEI savvy but not super technical.
Yes, I was thinking we'd try to pair people up in such a way that each pair have between them some good TEI expertise and familiarity with the Guidelines, some technical skills, and some translation experience. Then we could assign each pair to start translating at a particular point in the spreadsheet, so they don't overlap. Anyone finishing their block can be assigned another block. Each completed block should also be reviewed by another team before acceptance.
But - the discussion about the translations also had some points about how to re-incorporate translations into the Guidelines and also how to keep them up to date. Is this still an issue? It would be neat to have the translations quickly available.
Yes, definitely. What I'd plan to do is to have the automated re-integration code ready to go -- it won't be very complicated, it just needs writing, and this would be a good spur for me to do that. Then I could run that code during the lunch break, and everyone coming back after lunch should be able to see their translations already working in the Jenkins builds.
The Greek translation group were very enthusiastic, but haven't made much if any progress.
That's the thing: it's such a daunting task that people never really get started. I think I'd also want to get a couple of people involved ahead of time to translate (say) 25 items before the workshop, just to get us started, so that initial hump is over. If we could manage to make this work just once, I think it would be much easier to get other events organized wherever there are enough people gathered together. Once people see that there's a fully updated French translation, they'll be more willing to believe it's possible to do their own language. Cheers, Martin
On 24/12/14 04:15, Martin Holmes wrote:
Since the Members Meeting will be in France next year, how about organizing a one-day "translate-a-thon" to update the French translation? We could provide food, organization and guidance, and try to get a large enough group of interested French speakers to storm through as much of the French translation job as can be done in a day.
This does sound like a good idea, though it will need some careful preparation I think. Maybe we should bounce the idea off the local organizers first? There is the tei-fr list which could also be used to drum up interest. One thing I think would make the job easier is a database of existing translation pairs. The language of our descriptions is fairly repetitive, with a number of set phrases which we would hope to see being translated consistently. A display like that of http://www.linguee.fr/anglais-francais/traduction/translation+memory.html but using just the relevant bits of P5 as corpus would be fairly easy to build directly from the existing source, I would have thought.
On 14-12-29 08:06 AM, Lou Burnard wrote:
This does sound like a good idea, though it will need some careful preparation I think. Maybe we should bounce the idea off the local organizers first? There is the tei-fr list which could also be used to drum up interest.
Could you post there to find out if there would be interest? Or should we contact the local organizers first, as a matter of courtesy?
One thing I think would make the job easier is a database of existing translation pairs. The language of our descriptions is fairly repetitive, with a number of set phrases which we would hope to see being translated consistently. A display like that of http://www.linguee.fr/anglais-francais/traduction/translation+memory.html but using just the relevant bits of P5 as corpus would be fairly easy to build directly from the existing source, I would have thought.
I reckon we could use similarity matching to provide, for each row in the spreadsheet, the nearest couple of matches to the English source for which valid translations already exist. Then we could regenerate that at break time, after a bunch of work had been done, to provide new content from things translated up to that point. Cheers, Martin
Carved in stone on my iPad
On 29 Dec 2014, at 16:06, Lou Burnard wrote:
One thing I think would make the job easier is a database of existing translation pairs. The language of our descriptions is fairly repetitive, with a number of set phrases which we would hope to see being translated consistently. A display like that of http://www.linguee.fr/anglais-francais/traduction/translation+memory.html but using just the relevant bits of P5 as corpus would be fairly easy to build directly from the existing source, I would have
I considered this at one time, but my heart failed me at the thought of the work involved in parsing all the texts, identifying the sentence structure, finding the translated pairs, allowing for different structure u.s.w. It doesn't seem to me like a job for amateurs, so I for one reluctantly stand aside. Lou, how about asking our mutual friend Ana Frankenberg to advise on strategy? She might know of people who have run such events. Sebastian
On 29/12/14 22:55, Sebastian Rahtz wrote:
I considered this at one time, but my heart failed me at the thought of the work involved in parsing all the texts, identifying the sentence structure, finding the translated pairs, allowing for different structure u.s.w.
But you don't need to do that. The translated pairs are already there. All you need is to index them so that I can look up a word or phrase in (say) the English, and see each desc in which it appears, along with the corresponding translation in (say) French for that desc.
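The lookup described here could be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not anything proposed in the thread: the sample XML is a made-up fragment, the French strings are invented for the example rather than taken from P5, and the helper names `collect_pairs` and `lookup` are hypothetical. It pairs each English desc with the French desc that shares the same parent element, then scans the pairs for a word or phrase; with only a couple of thousand descriptions a plain scan is likely enough, so no separate word index is built.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment; the French text is illustrative, not from P5.
SAMPLE = """<valList xmlns="http://www.tei-c.org/ns/1.0">
  <valItem ident="abbreviated">
    <desc xml:lang="en">abbreviated form of title</desc>
    <desc xml:lang="fr">forme abrégée du titre</desc>
  </valItem>
  <valItem ident="main">
    <desc xml:lang="en">main title</desc>
    <desc xml:lang="fr">titre principal</desc>
  </valItem>
  <valItem ident="hours">
    <desc xml:lang="en">hours</desc>
  </valItem>
</valList>"""


def collect_pairs(root, source="en", target="fr"):
    """Return (source_text, target_text) for every element that has both."""
    pairs = []
    for parent in root.iter():
        by_lang = {}
        for child in parent:
            if child.tag == TEI + "desc":
                text = " ".join("".join(child.itertext()).split())
                by_lang[child.get(XML_LANG)] = text
        if source in by_lang and target in by_lang:
            pairs.append((by_lang[source], by_lang[target]))
    return pairs


def lookup(pairs, phrase):
    """Case-insensitive search for a word or phrase in the source text."""
    needle = phrase.lower()
    return [(src, tgt) for src, tgt in pairs if needle in src.lower()]


pairs = collect_pairs(ET.fromstring(SAMPLE))
for english, french in lookup(pairs, "title"):
    print(f"{english}  ->  {french}")
```

Descriptions with no French counterpart (such as "hours" in the sample) are simply left out of the pair list, so they never show up as suggestions.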
On 29 Dec 2014, at 23:13, Lou Burnard wrote:
But you don't need to do that. The translated pairs are already there. All you need is to index them so that I can look up a word or phrase in (say) the English, and see each desc in which it appears, along with the corresponding translation in (say) French for that desc.
Yes, exactly. It was the thought of implementing just this which made me quail, finding corresponding words and phrases and indexing them. It's probably easier than I think. Am not tempted, though :-) Sebastian
who knew that there are 78 identical <gloss> or <desc> in the TEI? what joy.
abbreviated form of title abbreviation alternate attributes certainty character columns content model damage results from mildew on the leaf surface damage results from rubbing of the leaf edges damage results from smoke days descriptive docbook duration enjambement exclusive full form gender groups information relating to one homograph within an entry. homograph hours identifier identifies the text types or classifications applicable to this item by pointing to other elements or resources defining the classification concerned. inclusive indicates that the alternation is exclusive, i.e. that at most one of the alternatives occurs. indicates that the alternation is not exclusive, i.e. that one or more of the alternatives occur. indicates the length of this element in time. indicates whether or not the phenomenon is repeated. language maximum number of occurences milliseconds minimum number of occurences minutes namespace number optional ordered organization paragraph content part of speech partial performance prefix provides attributes for recording normalized temporal durations. reference regular expression pattern responsibility seconds signature specifies the earliest possible date for the event in standard form, e.g. yyyy-mm-dd. specifies the effect of this declaration on its parent module. specifies the latest possible date for the event in standard form, e.g. yyyy-mm-dd. states whether the alternations gathered in this collection are exclusive or inclusive. subordinate suffix synonym text encoding initiative this declaration changes the declaration of the same name in the current definition this declaration is added to the current definitions this declaration replaces the declaration of the same name in the current definition uniform resource locator unknown witness or witnesses
Sebastian
On 30/12/14 13:21, Sebastian Rahtz wrote:
who knew that there are 78 identical <gloss> or <desc> in the TEI? what joy.
abbreviated form of title abbreviation alternate attributes
I think I may speak for others when I say in response to this list "che?" wot are you on about? "identical" to what? Do you just mean that the phrases on your list are all used more than once?
On 30 Dec 2014, at 15:22, Lou Burnard wrote:
On 30/12/14 13:21, Sebastian Rahtz wrote:
who knew that there are 78 identical <gloss> or <desc> in the TEI? what joy.
abbreviated form of title abbreviation alternate attributes
I think I may speak for others when I say in response to this list "che?" wot are you on about? "identical" to what? Do you just mean that the phrases on your list are all used more than once?
yes. well, not phrases, whole descriptions. I took all the <desc> and <ident> contents, and ran a sort -u on them, finding 78 duplicates. eg
title.xml: <desc versionDate="2007-06-27" xml:lang="en">abbreviated form of title</desc>
titlePart.xml: <desc versionDate="2007-06-27" xml:lang="en">abbreviated form of title</desc>
both of which come from possible values for @type. makes you think. not a lot, but a bit. Sebastian
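A check of this kind can be sketched in a few lines of Python. It is not the command that was actually run: on the command line, `sort -u` by itself only collapses repeated lines, and `sort | uniq -d` is the usual way to print them. The sketch below assumes the descriptions have already been extracted as (file, text) pairs; the first two entries are the ones quoted above, the third is a hypothetical non-duplicate added for contrast, and `find_duplicates` is an illustrative name.

```python
from collections import defaultdict

# (file, description) pairs. The first two are quoted in the thread;
# the third is a hypothetical entry that occurs only once.
DESCS = [
    ("title.xml", "abbreviated form of title"),
    ("titlePart.xml", "abbreviated form of title"),
    ("example.xml", "a description used only once"),
]


def find_duplicates(descs):
    """Map each description that occurs more than once to the files using it."""
    files_by_text = defaultdict(list)
    for filename, text in descs:
        key = " ".join(text.split())  # normalize whitespace before comparing
        files_by_text[key].append(filename)
    return {text: files for text, files in files_by_text.items() if len(files) > 1}


duplicates = find_duplicates(DESCS)
for text, files in duplicates.items():
    print(f"{text!r} appears in: {', '.join(files)}")
```

Keeping the file names alongside each repeated description makes it easy to see where a single shared translation could be reused.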
On 14-12-30 04:28 AM, Sebastian Rahtz wrote:
On 29 Dec 2014, at 23:13, Lou Burnard wrote:
But you don't need to do that. The translated pairs are already there. All you need is to index them so that I can look up a word or phrase in (say) the English, and see each desc in which it appears, along with the corresponding translation in (say) French for that desc.
Yes, exactly. It was the thought of implementing just this which made me quail, finding corresponding words and phrases and indexing them. It's probably easier than I think. Am not tempted, though :-)
While it might be practical, I don't think indexing by word is really going to be helpful. The same word in English in two given contexts is not necessarily best translated by the same word in French in each case. But I agree that one thing we could provide in advance is a standard vocabulary of terms and their French equivalents. Cheers, Martin
On 14-12-29 02:55 PM, Sebastian Rahtz wrote:
I considered this at one time, but my heart failed me at the thought of the work involved in parsing all the texts, identifying the sentence structure, finding the translated pairs, allowing for different structure u.s.w.
It doesn't seem to me like a job for amateurs, so I for one reluctantly stand aside.
I have a working implementation of the Universal Similarity Metric in XQuery that runs in eXist. It would be relatively easy to identify similar sentences and phrases. The problem is the combinatorial explosion; every phrase has to be compared with every other phrase to find the closest matches. We have over 1700 strings that need translating, which works out at nearly 3 million comparisons (about 1.4 million if each pair is compared only once). Those would have to be run in advance, to identify (say) for each individual string the nearest three matches by similarity in English; once you knew those, you could easily copy any of those which had already been translated into a cell in the row, so the translator of each row would have access to some examples using related terms. That would go some way towards encouraging consistency. It would also allow a translator who believed their translation was better than one of the related ones to go back and suggest a better translation to replace an existing one. Cheers, Martin
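The precomputation described above can be sketched in Python. This is not the XQuery/eXist implementation mentioned in the message: it assumes the Universal Similarity Metric is approximated by the Normalized Compression Distance, with zlib standing in as the compressor, and the function names are illustrative. The sample strings are taken from the list of repeated descriptions earlier in the thread. For n strings there are n(n-1)/2 unordered pairs, so 1,700 strings give 1,444,150 pairs.

```python
import zlib


def compressed_size(text):
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(a, b):
    """Normalized Compression Distance: lower means more similar."""
    size_a, size_b = compressed_size(a), compressed_size(b)
    size_ab = compressed_size(a + "\n" + b)
    return (size_ab - min(size_a, size_b)) / max(size_a, size_b)


def pair_count(n):
    """Number of unordered pairs among n strings."""
    return n * (n - 1) // 2


def nearest_matches(strings, k=3):
    """For each string, the k other strings with the lowest NCD."""
    nearest = {}
    for i, source in enumerate(strings):
        scored = sorted(
            (ncd(source, other), other)
            for j, other in enumerate(strings)
            if j != i
        )
        nearest[source] = [other for _, other in scored[:k]]
    return nearest


STRINGS = [
    "damage results from mildew on the leaf surface",
    "damage results from rubbing of the leaf edges",
    "damage results from smoke",
    "indicates the length of this element in time.",
    "indicates whether or not the phenomenon is repeated.",
]

table = nearest_matches(STRINGS)
for source, matches in table.items():
    print(source)
    for match in matches:
        print("    ", match)
print(pair_count(1700), "unordered pairs for 1700 strings")
```

On strings this short, zlib's fixed overhead makes the distances noisy, so the ranking should be treated as a rough guide. The table only needs regenerating when the English source changes; already-translated matches can then be copied into each row from it.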
Lou, how about asking our mutual friend Ana Frankenberg to advise on strategy? She might know of people who have run such events.
Sebastian
participants (4)
- Lou Burnard
- Martin Holmes
- Mylonas, Elli
- Sebastian Rahtz