Using PHP/Curl to make big Google Charts

Google Charts is a wonderful tool for generating all kinds of graphs and charts. It provides two methods for supplying data: in the URL or via POST.
The URL version is limited to 2048 characters and the POST version is limited to 16K characters. I wish the FAQ explained a little more about why the limitation exists.

I’ve run into two problems using it in my applications. Pages served over SSL complain if any non-SSL content is pulled in, and occasionally I have data tables too large for the URL API, even though the URL really is the only way to retrieve the image in my application.

I’ve got a simple PHP/cURL-based proxy which solves both issues: https://github.com/derak-kilgo/google-chart-proxy. It’s a drop-in replacement for Google’s chart URL.

http://chart.apis.google.com/chart?chs=350x225&cht=p3&chd=s:Mx&chdl=Charts+Users|Should+Use+Charts&chl=Users|Don't+Use&chtt=Google+Charts&chts=676767,20

becomes

http://your-domain.com/proxy.php?chs=350x225&cht=p3&chd=s:Mx&chdl=Charts+Users|Should+Use+Charts&chl=Users|Don't+Use&chtt=Google+Charts&chts=676767,20

It’s just that simple.
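For reference, the core of such a proxy is only a few lines. Here’s a minimal sketch of the idea (the repo linked above is more complete): it forwards whatever query string it receives to the chart API as a POST, sidestepping the 2048-character URL limit, and streams the image back over your own domain.

<?php
// proxy.php - a minimal sketch, not the exact code from the repo.
$chartUrl = 'http://chart.apis.google.com/chart';

$ch = curl_init($chartUrl);
curl_setopt($ch, CURLOPT_POST, 1);
// Forward the raw query string as the POST body.
curl_setopt($ch, CURLOPT_POSTFIELDS, $_SERVER['QUERY_STRING']);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$image = curl_exec($ch);
curl_close($ch);

header('Content-Type: image/png');
echo $image;
?>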

0 to Dev in 10 seconds with Ubuntu

The goal is to take a base install of Ubuntu 10.x desktop and make it ready for PHP web development.

These commands download about 500MB of software.

Run it line by line as root, or make a bash script out of it.


#!/bin/bash

#All of these steps must be done as root.
if [ "$(whoami)" != 'root' ]; then
echo "This script must be run as root."
exit 1;
fi

#Add the zend repo to apt.
echo "deb http://repos.zend.com/zend-server/deb server non-free" >> /etc/apt/sources.list

#Add zend's signing key to the apt key ring so we can use the zend repo.
wget http://repos.zend.com/deb/zend.key -O- | apt-key add -

#Add yogarine's repo so we can download the latest version of eclipse and the php development tools.
add-apt-repository ppa:yogarine/eclipse

# Update your repo cached software list.
apt-get update

#Install eclipse with php development tools (latest), zend server (apache, php, and php control panel), mysql (cli client and server) in a single command.
apt-get install eclipse-pdt zend-server-ce-php-5.3 php-5.3-extra-extensions-zend-server-ce mysql-server mysql-client phpmyadmin

Post-install tasks:

Reboot your computer.
Eclipse will install OpenJDK, and you must restart to complete the installation.

Visit http://127.0.0.1:10081 to complete the setup of your Zend Server control panel.

Set up your document root.
I usually keep my workspace in my home directory and point /var/www at it like so:

cp /var/www/*.php ~/workspace/
sudo rm -f -r /var/www
#Replace $USER with your login name.
sudo ln -s -v /home/$USER/workspace /var/www

To access the debugger from PDT, add the following GET variables to your request:


http://localhost/test/info.php?debug_host=127.0.0.1%2C127.0.0.1&start_debug=1&debug_port=10000&original_url=http%3A%2F%2Flocalhost%2Ftest%2Finfo.php&send_sess_end=1&debug_stop=1&debug_start_session=1&debug_no_cache=1310991085348&debug_session_id=1000

Simple RSS Reader Examples

This sketch covers two really simple RSS readers:
one in PHP and one in JavaScript.

RSS, or Really Simple Syndication, is a text-based format for publishing news and information. Being plain text makes it easy to manipulate with your language of choice. I like PHP and JavaScript, so I’m using both for this example.

Download my code for these examples

Files included in this example

First, let’s look at the PHP example.

If you open reader.php, you’ll see two functions:

One is a PHP environment test that checks your PHP instance to make sure the functions we’re using are turned on.
Some hosting companies disable some of these extensions. If you don’t have direct control over the server environment, it’s always good to do a little probing to make sure things are going to work as expected. There is also a constant at the top of the page that disables the test; once you’ve run this code on your server successfully, there is no need to waste CPU cycles on this test with every page load.
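A sketch of what such a test might look like (the constant name here is illustrative, not the one from the download):

<?php
// Set to true once the test has passed on your server.
define('SKIP_ENV_TEST', false);

if (!SKIP_ENV_TEST) {
    if (!function_exists('simplexml_load_string')) {
        die('SimpleXML is not available on this server.');
    }
    if (!ini_get('allow_url_fopen')) {
        die('allow_url_fopen is disabled, so file_get_contents() cannot fetch URLs.');
    }
}
?>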

The second function is the RSS reader. Let’s look at the feed data so we can understand what is going on here.

Example RSS2.0 feed data:

<?xml version="1.0"?>
<rss version="2.0">

<channel>
<item>
<title>Welcome to my blog</title>
<link>http://127.0.0.1/my-blog</link>
<description>A simple example from my blog</description>
</item>

</channel>
</rss>

With SimpleXML, if you want the title from the first post, your code would look something like this:

$strData = file_get_contents('http://127.0.0.1/rss');
$oXml = simplexml_load_string($strData);
$title = $oXml->channel->item[0]->title;

PHP allows URLs to be loaded with any of the file functions, as if they were files on the local filesystem. As such, I can use file_get_contents() to load the data from the URL into a string. (This depends on PHP’s ‘allow_url_fopen’ setting, which must be enabled.)

SimpleXML is doing all the hard work. It turns that string of XML into a set of nested objects, which lets us programmatically get at the data in the XML document without having to build a complicated custom string parser.
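For example, assuming $oXml was loaded as above, walking every item in the feed is just a foreach:

<?php
foreach ($oXml->channel->item as $item) {
    // Each element is a SimpleXMLElement; cast to string to get the text.
    echo (string)$item->title . ' - ' . (string)$item->link . "\n";
}
?>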

In the example, all operations with SimpleXML are enclosed in a try-catch block. This is because SimpleXML will throw exceptions if the XML is not formed correctly (a missing open or close tag, reserved characters where there shouldn’t be any, etc.), or if you try to access an element that does not exist.

You’ll also notice, in the example code, the third parameter of simplexml_load_string(). By giving the function LIBXML_NOCDATA, SimpleXML will automatically convert any <![CDATA[ ]]> blocks it encounters into strings. This sort of block is used to safely enclose characters that would otherwise be reserved by the XML format. Without that option, our feed will return an empty SimpleXML object every time it encounters a CDATA block.
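The call looks like this; the second argument is just the default class name:

<?php
$oXml = simplexml_load_string($strData, 'SimpleXMLElement', LIBXML_NOCDATA);
// CDATA sections now come back as ordinary strings.
$description = (string)$oXml->channel->item[0]->description;
?>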

That is all you need to read XML with PHP.

One note: most servers send XML as UTF-8 instead of simple ASCII text. If your feed reader takes UTF-8 text and tries to display it as ASCII without properly converting it first, you may notice artifacts in the text that don’t make sense. These kinds of issues can be fixed with PHP’s multi-byte string functions, at the expense of added complexity in the feed reader code.
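For instance, if your page happens to be served as ISO-8859-1, one possible conversion (assuming the mbstring extension is available) would be:

<?php
$title = (string)$oXml->channel->item[0]->title;
// Convert from the feed's UTF-8 to the page's encoding before display.
$title = mb_convert_encoding($title, 'ISO-8859-1', 'UTF-8');
?>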

This is not ‘production’ code.

The PHP example is not really production code. The PHP process blocks while waiting for the feed to be read; it won’t continue loading the page until the feed is loaded or the connection between your server and the feed server times out. A workaround is to load the feed and cache it on the server, as a file or in memory with a tool like memcached. In most cases this scales well and mitigates the dependency on the other site, but it does not eliminate the problem.
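A rough sketch of the file-cache workaround, with an illustrative cache path and a 15-minute lifetime (neither is from the example code):

<?php
$feedUrl   = 'http://127.0.0.1/rss';
$cacheFile = '/tmp/feed-cache.xml';
$maxAge    = 900; // seconds

if (file_exists($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
    // Serve the cached copy instead of blocking on the remote server.
    $strData = file_get_contents($cacheFile);
} else {
    $strData = file_get_contents($feedUrl);
    file_put_contents($cacheFile, $strData);
}
?>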

Reading feeds with Javascript:

A JavaScript example from scratch would be a lot more complicated than this PHP example, so I’m using Jean-Francois Hovinne’s jQuery plug-in, jFeed.

This implementation has the added bonus of dealing well with other character sets like UTF-8.
The files included in the example code:

  • jquery.js – The core jQuery JavaScript library
  • jquery.jfeed.pack.js – A minified version of the jFeed plug-in
  • reader.js – A simple reader which works just like the PHP reader, except in JavaScript.
  • proxy.php – Built on Jean’s example, with an extra security check.

In this case, the page will load completely before the jFeed library tries to load the feed. jFeed uses an XMLHttpRequest to load the feed data and jQuery’s DOM functions to parse it, giving us access to the data elements much like SimpleXML does in PHP.

Once the feed is loaded, jFeed uses jQuery’s DOM functions to add the content to the page. The user will notice some lag between the page loading and the feed loading, but the feed won’t slow down the rest of the page as it did in the PHP example. Caching with a quick PHP script could also be used here to speed things up a bit.

JavaScript does have limitations. You won’t be able to read a feed from another domain, which is why Jean-Francois included the proxy.php script. This script loads the feed and presents it to the JavaScript as if it came from the same domain. He says, “don’t use this in production,” and I would agree. I made one little change to his script, adding a check to make sure the request is coming from localhost and not from some external site using us as a generic proxy, which should make it a little safer to use. Having something like this on your server also opens your domain to XSS attacks.
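One way to implement that kind of check (a hedged sketch, not necessarily the exact code in the download; note it only makes sense while the demo itself is being served from localhost):

<?php
// Reject requests that don't originate from the local machine.
if ($_SERVER['REMOTE_ADDR'] !== '127.0.0.1') {
    header('HTTP/1.1 403 Forbidden');
    exit('This proxy only accepts local requests.');
}
?>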

Javascript Example Code

To read the feed, include the required JavaScript files in your document’s head (they are included in the sample code):

<html>
<head>

<script type="text/javascript" src="jquery/jquery.js"></script>
<script type="text/javascript" src="jquery/jquery.jfeed.pack.js"></script>
<script type="text/javascript" src="jquery/reader.js"></script>

Add a div tag as a placeholder where you would like your RSS feed to appear:

<div id="feedReplace"></div>

Add this just above the closing body tag:

<script type="text/javascript">getFeed('#feedReplace', 'http://yoururl.com/rssfeed', 5);</script>

When the page loads, the JavaScript will load the RSS, parse it, format it, and append it to the page at the specified location. You can also specify how many items it should show; the example above shows the top 5 items.


Which approach is better?

If the feed data is a featured item of your site, it may make more sense to use PHP, because you can be sure the content will load. If the feed is just an extra on a larger site, JavaScript makes a lot of sense, because it doesn’t bog down the site while the feed is loading.

Send a file via POST with cURL and PHP

cURL is a great library. It can do just about anything a normal web browser can do, including sending a file via a POST request.

This makes it really easy to transmit files between computers. In my case, I was looking for an easy way to send images snapped by various webcam systems to a central server, with PHP managing the images.

Here is a simple script to send a file with php/cURL via POST:

<?php
$target_url = 'http://127.0.0.1/accept.php';
// This needs to be the full path to the file you want to send.
$file_name_with_full_path = realpath('./sample.jpeg');
/* cURL will accept an array here too.
 * Many examples I found showed a url-encoded string instead.
 * Note that the 'key' in the array will be the key that shows up in the
 * $_FILES array of the accept script, and the at sign '@' is required
 * before the file name.
 */
$post = array('extra_info' => '123456', 'file_contents' => '@' . $file_name_with_full_path);

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $target_url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ch);
curl_close($ch);
echo $result;
?>

And here is the corresponding script to accept the file.

<?php
$uploaddir = realpath('./') . '/';
$uploadfile = $uploaddir . basename($_FILES['file_contents']['name']);
echo '<pre>';
if (move_uploaded_file($_FILES['file_contents']['tmp_name'], $uploadfile)) {
    echo "File is valid, and was successfully uploaded.\n";
} else {
    echo "Possible file upload attack!\n";
}
echo 'Here is some more debugging info:';
print_r($_FILES);
echo "\n<hr />\n";
print_r($_POST);
print "</pre>\n";
?>

And that's it.
Navigate to the 'send' script and it will transmit the file sample.jpeg to the accept script.

Note that you can include other arguments in the POST along with the file. This allows you to authenticate the upload. I'm using a pre-shared string to 'validate' that the upload came from my send script.
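On the accept side, a validation sketch might look like this. The field name and value come straight from the send script above; in real use, the secret should obviously be something less guessable than '123456'.

<?php
// Reject uploads that don't carry the expected pre-shared string.
if (!isset($_POST['extra_info']) || $_POST['extra_info'] !== '123456') {
    header('HTTP/1.1 403 Forbidden');
    exit('Upload rejected: missing or bad pre-shared key.');
}
?>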

This works with the command line version of php too.

 

UPDATE Oct. 2014 – 

If you're using PHP 5.5 or better, check out the recently added 'CURLFile' class, which makes the whole process a lot easier.
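A sketch of the same send script rewritten with CURLFile:

<?php
$target_url = 'http://127.0.0.1/accept.php';
// CURLFile replaces the old '@' syntax: path, MIME type, posted filename.
$cfile = new CURLFile(realpath('./sample.jpeg'), 'image/jpeg', 'sample.jpeg');
$post = array('extra_info' => '123456', 'file_contents' => $cfile);

$ch = curl_init($target_url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ch);
curl_close($ch);
echo $result;
?>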

 

References:
http://us3.php.net/manual/en/function.move-uploaded-file.php
http://us3.php.net/manual/en/features.file-upload.post-method.php
http://curl.haxx.se/libcurl/php/examples/multipartpost.html
http://forums.devshed.com/php-development-5/php-curl-send-a-file-533233.html

Open Source Content Filter

When you’ve got lots of young internet users, a filter is the best way to allow access while keeping a lot of the questionable content out. Such systems are typically expensive and difficult to set up and administer.

DansGuardian aims to change that. This open source content filter and web proxy is quite effective at filtering questionable content and even ads. It can even be set up to use external anti-virus programs to scan content as it’s being accessed.

How does a content filter work?

The filter gets in the middle of the conversation between you and the web server.

Web proxy/filter diagram

Your browser asks the proxy/filter server for a website. The proxy server scans the request and the response for questionable content and viruses. If everything is clean, the content is returned to your browser from the proxy. If there is a problem with the content, then it is blocked.

Zero to filter in 10 minutes flat:

Assumptions: you have access to an Ubuntu server, and said server has access to the internet.

  1. Open a command prompt and type:
    sudo apt-get install tinyproxy dansguardian
    This will install tinyproxy, a small web proxy server, and dansguardian, a content filtering system.
    Ubuntu will also recommend ‘ClamAV’. Accept the defaults and install.
  2. Configure dansguardian.
    Edit the /etc/dansguardian/dansguardian.conf file
  3. Place a pound sign (#) in front of the line with the word ‘UNCONFIGURED’.
  4. Remove the pound sign in front of the line that starts with:
    contentscanner = '/etc/dansguardian/contentscanners/clamav.conf'
    This will enable ClamAV scanning of content.
  5. Next, edit the conf file for tinyproxy, located here:
    /etc/tinyproxy/tinyproxy.conf
  6. Around line 15, you should see the line ‘Port 8888’. Change that to ‘Port 3128’.
  7. Start it up. You’ll need to start the proxy first, then the filter.
    sudo /etc/init.d/tinyproxy start
    sudo /etc/init.d/dansguardian start
  8. Configure your client computers to use the proxy.
    In Firefox, for example, go to Tools->Options->Advanced->Network tab.
    Click on the ‘Settings’ button.
    Click on ‘Manual proxy settings’.
    In the HTTP proxy settings, enter the address of your proxy server. In the port box, enter 8080.
  9. In your internet router, block access to the internet from all addresses except the proxy server.

Done!

Gotchas:

  • If the firewall on the proxy server is off or allows direct connections, your filter can be bypassed by connecting straight to tinyproxy on port 3128. Make sure only localhost can connect to this port.
  • Anyone with SSH access to the server can subvert your filter by port-forwarding and connecting directly to the proxy on port 3128.
  • If the firewall on the proxy server is not allowing connections to port 8080, then no one will be able to use your new content filter.
  • DansGuardian has a Perl GUI, but mod_perl is disabled on my server. I wrote a quick PHP script to replace it. You’ll need to modify your dansguardian.conf file to enable it.
  • Webmin provides a GUI for this system. If you’re not comfortable editing text files on a Linux system, Webmin is the way to go. It provides a web GUI for making changes to a Linux system.
  • While it is possible to install this on an Ubuntu desktop, it’s best to do it on a computer/server with limited physical access. This makes bypassing the filter much more difficult.