Hadoop´s Internationalization
Marcos Ortiz 2011-06-25, 02:55
Regards to the whole list.
I'm looking for a proper way to work on the internationalization of Hadoop, but I don't know whether this is a good project or whether it would be useful to the community. I think it would be very useful for the many people who want to see the project's messages in another language, for example in Spanish.
I would like to work on this, but I don't know how to start. I think a first step would be to translate the messages of the commands most used by Hadoop administrators and developers (the Hadoop fs shell, for example). It would be very useful to have a directory in the source code for each language's messages, but I don't know whether this would require many changes to the main API.
What do you think about this?
Thanks for your time.
--
Marcos Luís Ortíz Valmaseda
Software Engineer (UCI)
http://marcosluis2186.posterous.com
http://twitter.com/marcosluis2186
Re: Hadoop´s Internationalization
Owen O'Malley 2011-06-25, 08:38
On Fri, Jun 24, 2011 at 7:55 PM, Marcos Ortiz <[EMAIL PROTECTED]> wrote:
> Regards to all the list.
> I'm looking for a proper way to work on the internationalization of Hadoop, but I don't know if this is a good project or if this is useful for the community, at least, I think that it would be very useful for many people that want to see the messages of the project in another language, for example, in Spanish.
I think it would be a very useful but very time-consuming project. I'd suggest that you start by translating the documentation for the release into another language first.
-- Owen
Re: Hadoop´s Internationalization
Marcos Ortiz 2011-06-25, 15:17
OK, Owen. Where can I find the sources of the docs? What format are the docs in: DocBook, ReST, etc.?
On 6/25/2011 4:38 AM, Owen O'Malley wrote:
> I think it would be a very useful but very time-consuming project. I'd suggest that you start by translating the documentation for the release into another language first.
--
Marcos Luís Ortíz Valmaseda
Re: Hadoop´s Internationalization
Ted Yu 2011-06-25, 16:08
Marcos:
Which Hadoop version(s) do you plan to work on?
>> Where I can find the sources of the docs?
In the latest TRUNK, you would find these directories:
./common/src/docs
./hdfs/src/c++/libhdfs/docs
./hdfs/src/docs
./mapreduce/src/docs
Cheers
Re: Hadoop´s Internationalization
Eric Baldeschwieler 2011-06-30, 16:06
Do other Apache projects have a good localization framework for error messages?
I'd think there would be interest in starting this, although since we'd need to add the framework, this would be an investment.
On Jun 25, 2011, at 1:38 AM, Owen O'Malley wrote:
> I think it would be a very useful but very time-consuming project. I'd suggest that you start by translating the documentation for the release into another language first.
Re: Hadoop´s Internationalization
Jakob Homan 2011-06-30, 21:26
I'd be hesitant about this given our experience with the Chinese version of the Hadoop documents. Picking the Chinese version of hdfs_quota_admin_guide.xml, as an example, shows that it has not actually been updated (beyond svn moves and copyright years) since its original commit back on 12/10/08 (svn rev: 736174). I'm actually going to open a JIRA to remove it since it's now woefully out of date (and even worse than no documentation is wrong documentation). We'd need to make sure that there would be contributors willing and able to keep these documents up-to-date, which hasn't happened with the Chinese version, even though we have quite a few Chinese-speaking contributors.
-jakob
On Thu, Jun 30, 2011 at 9:06 AM, Eric Baldeschwieler <[EMAIL PROTECTED]> wrote:
> Do other apache projects have a good localization framework for error messages?
> I'd think there would be interest in starting this, although since we'd need to add the framework, this would be an investment.
Re: Hadoop´s Internationalization
Harsh J 2011-07-01, 05:10
Having spent some time in KDE before, I found that proper internationalization could only be achieved if string-freeze dates are set and notifications are pushed out to translator teams, giving them a period in which to translate the newly exported strings (it could be an ongoing process, but before a release is cut there ought to be a string-freeze period). Where that can't be done, English is fallen back upon…
But KDE had large numbers of simple strings to translate, while Hadoop has more paragraphs and documents that need translation (setting API docs and error messages aside). I think a different approach could be worked out with that in mind.
On Fri, Jul 1, 2011 at 2:56 AM, Jakob Homan <[EMAIL PROTECTED]> wrote:
> We'd need to make sure that there would be contributors willing and able to keep these documents up-to-date, which hasn't happened with the Chinese, even though we have quite a few Chinese-speaking contributors.
-- Harsh J
Re: Hadoop´s Internationalization
Owen O'Malley 2011-07-01, 05:44
On Thu, Jun 30, 2011 at 9:06 AM, Eric Baldeschwieler <[EMAIL PROTECTED]> wrote:
> Do other apache projects have a good localization framework for error messages?
Java has very good localization capabilities. However, making each and every user-facing string localizable would be a huge, pervasive change.
-- Owen
Re: Hadoop´s Internationalization
Steve Loughran 2011-07-01, 15:38
On 01/07/2011 06:44, Owen O'Malley wrote:
> Java has very good localization capabilities. However, it is a huge pervasive change if we want to get each and every user-facing string localizable.
Let's be precise: Internationalisation (note the spelling) is a maintenance mess too. It's not so much a "one off event" as something you have to do every time anyone adds an error message, or you gradually let the percentage of i18n'd messages drop over time. Given that a limitation of Hadoop now is that the messages aren't that helpful when you get near the fringes of the valid configuration space, I'd focus on those.
I say "I" literally here, as it tends to be me that hits these problems.
In a concession to the US installed base, I will spell words like "datacentre" and "normalised" incorrectly for EN_GB. This is not just politeness, it's self interest: I added a message to Ant about an unknown task that said "your task is spelt wrong", and we kept on getting bugreps saying "you have spelled spelled wrong" that I'd close as "workforme, you can't spell the past tense of spelled correctly", until I got bored and changed it to a present-tense form that was valid everywhere.
Internationalised getting-started docs and examples are good, but error messages may be best left as is. One possibility, though, is to add a unique error code to each one that could be indexed in each document, wiki, etc.
-steve
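As an illustration of the unique-error-code idea, here is a minimal Java sketch. Everything in it is an assumption made for the example: the `ErrorCode` enum, the `HDP-0001` style identifiers, and the `format` helper do not exist in Hadoop. The point is only that the English text stays as it is while a stable code travels with it, so translated documents or wiki pages can be indexed by that code.

```java
class ErrorCodeDemo {
    /** Hypothetical error catalogue; none of these identifiers exist in Hadoop. */
    enum ErrorCode {
        CONNECTION_REFUSED("HDP-0001", "Connection refused"),
        QUOTA_EXCEEDED("HDP-0002", "Quota exceeded");

        final String id;       // stable code that docs in any language can index
        final String english;  // the message text itself stays in English

        ErrorCode(String id, String english) {
            this.id = id;
            this.english = english;
        }

        /** Prefixes the English message with its code and appends call-site detail. */
        String format(String detail) {
            return "[" + id + "] " + english + ": " + detail;
        }
    }

    public static void main(String[] args) {
        System.out.println(ErrorCode.CONNECTION_REFUSED.format("namenode:8020"));
    }
}
```

Because the code is part of the message, a user can search for it in documentation written in any language without the message itself being translated.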
Re: Hadoop´s Internationalization
Eric Baldeschwieler 2011-07-01, 16:23
Unique error codes sound like a very good place to start!
Freezing the code while a hypothetical group of volunteers localizes sounds premature to me.
Are there good examples of projects with lazy localization that work well? E.g., the project is released in English, but there are sub-trees where additional localized content can be populated based on index numbers and such?
E14
On Jul 1, 2011, at 8:38 AM, Steve Loughran wrote:
> Internationalised getting-started docs and examples are good, but error messages may be best left as is. One possibility, though, is to add a unique error code to each one that could be indexed in each document, wiki, etc.
Re: Hadoop´s Internationalization
Steve Loughran 2011-07-02, 14:23
On 01/07/2011 17:23, Eric Baldeschwieler wrote:
> Unique error codes sound like a very good place to start!
We could call them "URLs" and have things at the end of them:
http://wiki.apache.org/hadoop/ConnectionRefused
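A rough Java sketch of what "URLs as error codes" could look like follows. The wiki base URL and the `ConnectionRefused` page name come from the message above; the `WikiLinkedErrors` class, the `withWikiLink` helper, and the exact message wording are hypothetical and are not taken from the Hadoop code base.

```java
import java.io.IOException;
import java.net.ConnectException;

class WikiLinkedErrors {
    // Base URL as quoted in the thread.
    static final String WIKI_BASE = "http://wiki.apache.org/hadoop/";

    /**
     * Hypothetical helper: wraps an exception so that its message ends with
     * a wiki link that serves as the error's unique identifier.
     */
    static IOException withWikiLink(IOException cause, String wikiPage) {
        String text = cause.getMessage() + "; for more details see: " + WIKI_BASE + wikiPage;
        return new IOException(text, cause); // original exception kept as the cause
    }

    public static void main(String[] args) {
        IOException e = withWikiLink(new ConnectException("Connection refused"), "ConnectionRefused");
        System.out.println(e.getMessage());
    }
}
```

The page behind the link can then be written, extended, or translated independently of the release cycle.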
Re: Hadoop´s Internationalization
Allen Wittenauer 2011-07-14, 21:04
On Jul 2, 2011, at 7:23 AM, Steve Loughran wrote:
> we could call them "URLs" and have things at the end of them
> http://wiki.apache.org/hadoop/ConnectionRefused
This is how Sun (I refuse to acknowledge that "other company") built SMF. It would give you a short error, but provide a URL that would point to sun.com for more info (e.g. http://sun.com/msg/SMF-8000-05 , now locked behind Larry's boat money generator).
In the Hadoop case, we would likely want to provide a base set that folks could install and modify locally. For example, what may be an XYZ error generically may be more indicative of a problem with our local SerDes. Putting it on the wiki would be a good thing, yes, but I worry about local tribal knowledge spilling into it.
I dunno. Just thinking out loud.
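One way to picture a base set that sites can modify locally is a catalogue of help texts with local entries layered on top, sketched below in Java. The `ErrorHelpCatalog` class, the `XYZ` and `ABC` keys, and the help texts are all made up for illustration; only `java.util.Properties` and its defaults mechanism are real.

```java
import java.util.Properties;

class ErrorHelpCatalog {
    /** Hypothetical base set that would ship with a release. */
    static Properties baseSet() {
        Properties base = new Properties();
        base.setProperty("XYZ", "Generic description of error XYZ.");
        base.setProperty("ABC", "Generic description of error ABC.");
        return base;
    }

    /** Layers site-specific entries over the base set; base entries remain as fallbacks. */
    static Properties withLocalOverrides(Properties base, Properties local) {
        Properties merged = new Properties(base); // base acts as the defaults
        merged.putAll(local);                     // local entries take precedence
        return merged;
    }

    public static void main(String[] args) {
        Properties local = new Properties();
        local.setProperty("XYZ", "At this site, XYZ usually points to a problem with the local SerDes.");
        Properties help = withLocalOverrides(baseSet(), local);
        System.out.println(help.getProperty("XYZ")); // local text
        System.out.println(help.getProperty("ABC")); // falls back to the base text
    }
}
```

Keeping the local layer separate from the shipped base set is one way to stop site-specific knowledge from leaking into the shared documentation.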
Re: Hadoop´s Internationalization
Steve Loughran 2011-07-19, 09:33
On 14/07/11 22:04, Allen Wittenauer wrote:
> This is how Sun (I refuse to acknowledge that "other company") built SMF. It would give you a short error, but provide a URL that would point to sun.com for more info.
Yeah, changing URLs is one of the weaknesses of that design. Oracle managed to break all the javadoc links to JDK docs.
> In the Hadoop case, we would likely want to provide a base set that folks could install and modify locally. For example, what may be an XYZ error generically may be more indicative of a problem with our local SerDe's. Putting it on the wiki would be a good thing, yes, but I worry about local tribal knowledge spilling into it.
Its strength is that it stays up to date. I've been proposing adding this to connection operations:
https://issues.apache.org/jira/browse/HADOOP-7469
Re: Hadoop´s Internationalization
Luke Lu 2011-07-01, 16:27
On Fri, Jul 1, 2011 at 8:38 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> On 01/07/2011 06:44, Owen O'Malley wrote:
>> Java has very good localization capabilities. However, it is a huge pervasive change if we want to get each and every user-facing string localizable.
The pervasive change (using ResourceBundles or whatever to do the mapping from a canonical string to whatever) is done only once. i18n/l10n can then be improved incrementally by updating properties (or whatever) files.
__Luke
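The mapping from a canonical string to a localized one can be sketched with the JDK's `ResourceBundle` classes. The sketch below is only an assumption about how such a lookup might be wired: the `CanonicalMessages` class, the `localize` helper, and the Spanish entry are hypothetical, and an inline `ListResourceBundle` stands in for the per-locale properties files that `ResourceBundle.getBundle` would normally load.

```java
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

class CanonicalMessages {
    // Hypothetical Spanish catalogue, keyed by the canonical English string.
    static final ResourceBundle SPANISH = new ListResourceBundle() {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                {"Connection refused", "Conexi\u00f3n rechazada"},
            };
        }
    };

    /** Returns the translation if the bundle has one, otherwise the canonical English text. */
    static String localize(ResourceBundle bundle, String canonical) {
        return bundle.containsKey(canonical) ? bundle.getString(canonical) : canonical;
    }

    public static void main(String[] args) {
        System.out.println(localize(SPANISH, "Connection refused")); // translated entry
        System.out.println(localize(SPANISH, "Quota exceeded"));     // no entry yet, stays English
    }
}
```

Once call sites go through a lookup like `localize`, coverage can grow one catalogue entry at a time, and untranslated messages simply remain in English.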