I've been assisting the Internet Archive on its Open Library project, my role being primarily to help them understand library data. It's fascinating watching non-librarians encounter library data -- so much that we take for granted isn't obvious at all to others. I'm thinking that it's time for a "Library Data for Dummies." I am seriously considering setting it up as a wiki so we can all contribute to it.
Most recently on the OL project we ran into what I like to call the "Mao" problem. It begins like this: the database uses bibliographic records from libraries and from Amazon. The Amazon data presents author names in natural order ("John Smith"), while the library records use the inverted order with the family name first ("Smith, John"). It's best for users of the service to see the names presented uniformly (the mixture is quite jarring). If you think about it for a moment, you realize that converting the natural-order names to inverted order will be problematic, since there is nothing to tell you where the family name begins ("Oscar de la Renta"). So the solution is to un-invert the inverted names, something that is purely mechanical.
Until you encounter Mao, Zedong -- and the thousands of other authors for whom "natural order" is family name followed by a given name. I find that Mao is the example that hits the "Aha!" button for most people. Obviously, presenting the name as "Zedong Mao" pretty much makes it unrecognizable. So what to do?
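For the non-librarians, here is roughly what "purely mechanical" means, as a minimal Python sketch. The function name `uninvert` is my own invention for illustration, and the sketch assumes the heading holds only "Family, Given" with no dates or titles after the name, which real library headings often do carry.

```python
def uninvert(heading: str) -> str:
    """Turn 'Family, Given' into 'Given Family'.

    Assumes a bare name heading: anything after the first comma is
    treated as the given name. Headings without a comma are returned
    unchanged (apart from surrounding whitespace).
    """
    family, sep, given = heading.partition(",")
    if not sep or not given.strip():
        return heading.strip()
    return f"{given.strip()} {family.strip()}"


print(uninvert("Smith, John"))  # John Smith
print(uninvert("Mao, Zedong"))  # Zedong Mao
```

The second line of output is the problem in a nutshell: the rule does exactly what it was told, and the result is wrong.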
Well, I suppose it helps to NOT think like a librarian. Edward Betts, the coder on this project, came up with an ingenious idea: he compared the names in the Open Library records with names on Amazon and on Wikipedia, and made a list of names that generally appear in family-name-first order, each with a link to the source where it was found. For famous authors or historical figures, Wikipedia contains many of the names and is good about presenting various name forms. It gives the traditional and simplified Chinese forms, and sometimes both Wade-Giles and Pinyin transliterations. It also often has the note:
WorldCat Identities pages, which are quite rich and link well to library data, since that's what they are based on. Presumably one could launch a search to either from a name heading. Has anyone tried this yet?
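Coming back to Edward's list of family-name-first authors: I haven't seen his code, so the following is only a sketch of how such a list might be used when displaying names, not his actual implementation. The names `FAMILY_FIRST` and `display_name` are hypothetical, and the single entry stands in for the real list, with a placeholder where the link to the source would go.

```python
# Hypothetical exception list: inverted heading -> where the
# family-name-first form was found (a placeholder, not a real link).
FAMILY_FIRST = {
    "Mao, Zedong": "Wikipedia",
}


def display_name(heading: str) -> str:
    """Return a display form for an inverted 'Family, Given' heading.

    Names on the exception list keep the family name first and simply
    lose the comma; all other names are un-inverted mechanically.
    Assumes a bare name heading with no dates after the name.
    """
    family, sep, given = heading.partition(",")
    family, given = family.strip(), given.strip()
    if not sep or not given:
        return heading.strip()
    if f"{family}, {given}" in FAMILY_FIRST:
        return f"{family} {given}"
    return f"{given} {family}"


print(display_name("Mao, Zedong"))  # Mao Zedong
print(display_name("Smith, John"))  # John Smith
```

The point of keeping the source alongside each name is that a person can check why a given author was put on the list.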