plagiarize

A werc handler that automates plagiarizing Wikipedia articles.

The handler looks for lines inside HTML documents that start with '%' and passes them to a custom function, plagiarize. Each such line is a comma-separated directive of the following form:

    %,1,Computer_Systems,Computer,png

After the leading '%', the values are:

1. The heading size. Here it is 1, which produces an <h1> heading.
2. The heading text, here Computer_Systems. Underscores are replaced with spaces in the output.
3. The name of a Wikipedia article, here Computer. The handler fetches the overview section of the article (the paragraphs before the table of contents) and places it after the heading. If this value is empty, no text is fetched.
4. An image extension, here png. This value is optional. If supplied, an image named [value 2].[value 4] (here img/Computer_Systems.png) is inserted before the text, shown through the thumbnail img/resized_Computer_Systems.png.
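As a rough illustration of the format above, the following sketch splits the sample directive into its values with the same cut and sed calls the handler uses. It is written in POSIX sh rather than rc so that it can be tried at an ordinary prompt. The variable names (size, heading, article, ext, title) are made up for this example and do not appear in the handler.

```shell
# Sample directive from the description above.
line='%,1,Computer_Systems,Computer,png'

# Field 1 is the leading '%', so the values start at field 2.
size=$(echo "$line" | cut -d , -f 2)       # heading size
heading=$(echo "$line" | cut -d , -f 3)    # heading text, underscores kept
article=$(echo "$line" | cut -d , -f 4)    # Wikipedia article name
ext=$(echo "$line" | cut -d , -f 5)        # image extension, may be empty

# The heading text is shown with spaces instead of underscores.
title=$(echo "$heading" | sed 's/_/ /g')

echo "heading: <h$size>$title</h$size>"
echo "article: http://en.wikipedia.org/wiki/$article"
if [ -n "$ext" ]; then
    echo "image:   img/$heading.$ext (thumbnail: img/resized_$heading.$ext)"
fi
```

The image file name keeps the underscores; only the visible heading text has them replaced.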

Handler

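fn plagiarize {
    # $i holds one directive: %,size,heading,article,extension
    echo '<div class="article">'
    # Heading: field 2 is the size, field 3 is the text (underscores become spaces).
    echo '<h'`{echo $i | cut -d , -f 2}'>'
    echo $i | cut -d , -f 3 | sed 's/_/ /g'
    echo '</h'`{echo $i | cut -d , -f 2}'>'
    # Optional image: only emitted when field 5 (the extension) is present.
    if(test ! -z `{echo $i | cut -d , -f 5}) echo '<a href="img/'`{echo $i | cut -d , -f 3}'.'`{echo $i | cut -d , -f 5}'"><img class="normimg" src="img/resized_'`{echo $i | cut -d , -f 3}'.'`{echo $i | cut -d , -f 5}'" alt="'`{echo $i | cut -d , -f 3}'" /></a>'
    # Article text: keep the <p> lines above the table of contents, drop the last one,
    # turn links into spans and hide footnote markers. sed must filter stdin here,
    # so it is called without -i (which edits files in place and fails in a pipeline).
    if(test ! -z `{echo $i | cut -d , -f 4}) curl http://en.wikipedia.org/wiki/`{echo $i | cut -d , -f 4} | head -n `{curl http://en.wikipedia.org/wiki/`{echo $i | cut -d , -f 4} | grep -n \"toc\" | cut -d : -f 1} | grep '^<p>'  | head -n -1 | sed 's/<a /<span /g' | sed 's/<\/a>/<\/span>/g' | sed 's/<sup /<sup style="display:none" /g' | sed 's/<p>In development<\/p>//g'
    echo '</div>'
}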

# Run the normal body handler; expand '%' directives and echo everything else unchanged.
for(i in `{run_handler $handler_body_main}) {
    if(test ! -z `{echo $i | grep '^%'}) plagiarize
    if not echo $i
}
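The long pipeline in plagiarize is easier to follow on a small local file. The sketch below applies the same filtering steps to a made-up page instead of a live Wikipedia download. It is POSIX sh, not rc, and the sample markup, the variable names (page, toc_line, lead) and the use of only the first "toc" match are assumptions for illustration. Real Wikipedia markup may differ. It also uses sed '$d' as a portable stand-in for the handler's head -n -1, which needs GNU head.

```shell
# Hypothetical stand-in for a downloaded article page.
page=$(mktemp)
cat > "$page" <<'EOF'
<h1>Computer</h1>
<p>A <a href="/wiki/Machine">machine</a> that computes.<sup class="reference">[1]</sup></p>
<p>Second lead paragraph.</p>
<p>Last paragraph before the contents box.</p>
<div id="toc" class="toc">
<p>Body text after the table of contents.</p>
EOF

# Line number of the first line containing "toc" (with the quotes).
toc_line=$(grep -n '"toc"' "$page" | head -n 1 | cut -d : -f 1)

# Keep the <p> lines above that point, drop the last of them,
# turn links into spans and hide footnote markers.
lead=$(head -n "$toc_line" "$page" |
    grep '^<p>' |
    sed '$d' |
    sed -e 's/<a /<span /g' \
        -e 's/<\/a>/<\/span>/g' \
        -e 's/<sup /<sup style="display:none" /g')

rm -f "$page"
echo "$lead"
```

With this sample, only the first two paragraphs are printed: the third is removed by the drop-last step, and the body paragraph lies below the "toc" line.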

Demo

http://werc.kotori.me/

License

Released into the public domain; also available under the MIT and ISC licenses.

Note

A recent Wikipedia change broke the handler. You can find a static version of the demo site in a working state at http://werc2.kotori.me/.