Tuesday, November 08, 2011

"Speaking Mandarin Chinese" in Hollywood

It is quite common to see scenes in TV shows or movies where the characters speak something they claim to be "Mandarin Chinese". Some characters even claim to be very fluent in it. However, for a native Chinese speaker from Taiwan like me, most of the time the so-called Chinese on screen is barely intelligible, if it can be understood at all.

It is quite strange, since there should be plenty of native Chinese speakers near the production locations of these shows and movies. Is it that hard to find a decent language consultant to ensure the proper pronunciation of a few lines? Or is Hollywood just too proud to admit that it can't get this right, being so self-centered and so used to laughing at those who don't speak proper English? It's quite painful to hear a character you love speak something that bears no resemblance to what it is claimed to be, if that "thing" can be called a language at all.



Saturday, November 05, 2011

Grabbing the vanity card of TBBT into an image

The producer of the TV show "The Big Bang Theory", Mr. Chuck Lorre, always shows a vanity card at the end of each episode. He also posts the same cards on his own website, along with those for the other shows he has produced.

Recently, for some reason, I wanted to take the vanity card for a specific episode of the show from the website and attach it to an e-mail as an image. I prefer the image to contain only the content of the card rather than the whole page. This, of course, could be done by taking a screenshot and cropping it with something like GIMP or ImageMagick. However, since I'm a lazy guy, and the chance that I will do this more than once is quite high, manual screen capturing and cropping is certainly not an option for me. Fortunately, I had some ideas on how to do this automatically.


There are lots of possible ways to grab a web page into an image on the command line. My weapon of choice is the still-buggy-but-quite-useful wkhtmltoimage from the wkhtmltopdf project. wkhtmltoimage uses WebKit and Qt to render a given page directly into an image. The great thing about this tool is that it honors the page's CSS and JavaScript, while also letting you replace the CSS with your own version and run some extra JavaScript before rendering happens.

At first, I tried to render the page into an image and then pass the image to ImageMagick's convert to cut out only the "vanity card" block of the page. However, this approach proved to be problematic, since it is hard to automatically determine the cropping parameters needed for the "-crop" option of convert. After inspecting the HTML and CSS sources of the page, I decided to experiment with the "visibility" property in the CSS definitions. I downloaded the CSS file, set the "visibility" property to "hidden" for the topmost selector (the "#container" selector block in this case), turned the visibility back on only for the "#content" block, and supplied the customized CSS to wkhtmltoimage. This gave me a rendered image that shows only the "card" block in the center of a white background. The white "border" can then be easily removed using the "-trim" option of convert.
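The CSS experiment described above can be approximated without editing the site's own stylesheet. The following is only a rough sketch under some assumptions: it writes a two-rule override file and passes it through wkhtmltoimage's "--user-style-sheet" option, which may not be exactly how the customized CSS was supplied originally. The function names and the file name "card-only.css" are made up for illustration.

```shell
# Sketch only: a tiny override stylesheet instead of a modified copy of the
# site's CSS. Function and file names here are hypothetical.

write_card_css() {
    # $1: path of the stylesheet to create
    # Hide everything under #container, then re-enable only #content.
    cat > "$1" <<'EOF'
#container { visibility: hidden !important; }
#content { visibility: visible !important; }
EOF
}

render_card_with_css() {
    # $1: page URL, $2: output image, $3: stylesheet path (optional)
    css=${3:-card-only.css}
    write_card_css "$css" || return 1
    # Render to stdout ("-") and let convert trim the white border.
    wkhtmltoimage --user-style-sheet "$css" "$1" - | convert - -trim "$2"
}
```

Something like `render_card_with_css 'http://chucklorre.com/index-bbt.php?p=364' tbbt.jpg` would then be the equivalent of the manual steps, assuming both tools are installed.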

Although the downloading-and-modifying-CSS approach was a success, supplying a whole modified CSS file to wkhtmltoimage is not elegant and could have side effects. A better approach is to take advantage of wkhtmltoimage's ability to run JavaScript, and alter the "visibility" property of the appropriate elements after the page has finished loading. Here is my final "one-liner" solution to my problem:


$ wkhtmltoimage \
--run-script "document.getElementById('container').style.visibility='hidden';" \
--run-script "document.getElementById('content').style.visibility='visible';" \
'http://chucklorre.com/index-bbt.php?p=364' - \
| convert - -trim tbbt.jpg

The generated JPEG image, "tbbt.jpg", only contains the "card" I want.

The same principle could also be applied to other pages. As usual, I wrote a script to save myself some typing; it takes an optional production number argument to grab the card for a specific episode. However, since it is a very simple script, I won't bother to post the code here...
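For readers who want a starting point, here is one way such a wrapper might look. It is a guess at the general shape, not the actual script: the function names, the output file naming, and the assumption that the page without a "p" parameter shows the latest card are all mine.

```shell
# Sketch of a wrapper around the one-liner above. All names are hypothetical.

BASE_URL='http://chucklorre.com/index-bbt.php'

card_url() {
    # $1: optional card number, used as the "p" query parameter.
    # Without a number, fall back to the bare page (assumed, not verified,
    # to show the latest card).
    if [ -n "$1" ]; then
        printf '%s?p=%s\n' "$BASE_URL" "$1"
    else
        printf '%s\n' "$BASE_URL"
    fi
}

card_file() {
    # $1: optional card number; gives tbbt-364.jpg for 364, tbbt.jpg otherwise.
    if [ -n "$1" ]; then
        printf 'tbbt-%s.jpg\n' "$1"
    else
        printf 'tbbt.jpg\n'
    fi
}

grab_card() {
    # $1: optional card number
    wkhtmltoimage \
        --run-script "document.getElementById('container').style.visibility='hidden';" \
        --run-script "document.getElementById('content').style.visibility='visible';" \
        "$(card_url "$1")" - \
    | convert - -trim "$(card_file "$1")"
}
```

With these functions loaded, `grab_card 364` would write "tbbt-364.jpg", and `grab_card` with no argument would write "tbbt.jpg".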
