Reduce the size of the link (URL)


Is it possible to reduce the size of a link (in text form) by PHP or JS?

E.g. I might have links like these:

http://www.example.com/index.html                     <- Redirects to the root
http://www.example.com/folder1/page.html?start=true   <- Redirects to page.html
http://www.example.com/folder1/page.html?start=false  <- Redirects to page.html?start=false

The purpose is to find out whether the link can be shortened and still point to the same location. In these examples the first two links can be reduced: the first points to the root, and the second has parameters that can be omitted.
The third link is the case where the parameters can't be omitted, so it can't be reduced any further than removing the http://.

So the above links would be reduced like this:

Before: http://www.example.com/index.html
After:  www.example.com

Before: http://www.example.com/folder1/page.html?start=true
After:  www.example.com/folder1/page.html

Before: http://www.example.com/folder1/page.html?start=false
After:  www.example.com/folder1/page.html?start=false

Is this possible by PHP or JS?

Note:

www.example.com is not a domain I own or have access to besides through the URL. The links are potentially unknown, and I'm looking for something like an automatic link shortener that can work by getting the URL and nothing else.

Actually, I was thinking of something like a link checker that could check whether the link works before and after the automatic trim; if it doesn't, the check would be repeated on a less trimmed version of the link. But that seemed like overkill...


Since you want to do this automatically, and you don't know how the parameters change the behaviour, you will have to do this by trial and error: remove parts from a URL and see whether the server responds with a different page.

In the simplest case this could work something like this:

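<?php
    $originalUrl = "http://stackoverflow.com/questions/14135342/reduce-link-url-size";

    // Fetch the reference page; without it there is nothing to compare against.
    $originalContent = file_get_contents($originalUrl);
    if ($originalContent === false) {
        exit("Could not fetch " . $originalUrl);
    }

    $trimmedUrl = $originalUrl;

    while (true) {
        // Drop the last path segment.
        $trialUrl = dirname($trimmedUrl);

        // Stop once the host has been cut off and only "http:" is left.
        if (!parse_url($trialUrl, PHP_URL_HOST)) {
            break;
        }

        // A failed request returns false, which never equals the original content.
        $trialContent = @file_get_contents($trialUrl);
        if ($trialContent === $originalContent) {
            $trimmedUrl = $trialUrl;
        } else {
            break;
        }
    }

    echo "Shortest equivalent URL: " . $trimmedUrl;
    // output: Shortest equivalent URL: http://stackoverflow.com/questions/14135342
?>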

For your usage scenario the code would be a bit more complicated, as you would have to test each parameter in turn to see whether it is necessary. As a starting point, see the parse_url() and parse_str() functions.
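As a rough sketch of that per-parameter step, the snippet below only builds the candidate URLs, each with exactly one query parameter removed; fetching and comparing them would work as in the loop above. The function name urlsWithOneParamRemoved() and the example URL with a second parameter are made up for illustration, and the sketch assumes the URL has no user info or fragment.

```php
<?php
// Hypothetical helper: return every variant of $url that has exactly one
// query parameter removed, keyed by the name of the removed parameter.
function urlsWithOneParamRemoved(string $url): array
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['query'])) {
        return []; // malformed URL, or no parameters to drop
    }

    // Split "a=1&b=2" into ['a' => '1', 'b' => '2'].
    parse_str($parts['query'], $params);

    // Rebuild everything in front of the query string.
    // Assumption: user info and fragment are not present and are ignored.
    $base = isset($parts['scheme']) ? $parts['scheme'] . '://' : '';
    $base .= $parts['host'] ?? '';
    $base .= isset($parts['port']) ? ':' . $parts['port'] : '';
    $base .= $parts['path'] ?? '';

    $candidates = [];
    foreach (array_keys($params) as $name) {
        $remaining = $params;
        unset($remaining[$name]);
        $query = http_build_query($remaining);
        $candidates[$name] = $base . ($query !== '' ? '?' . $query : '');
    }
    return $candidates;
}

print_r(urlsWithOneParamRemoved(
    'http://www.example.com/folder1/page.html?start=true&page=2'
));
```

Each candidate whose response matches the original page marks a parameter that can probably be dropped. Repeating the process on the shortened URL would handle several removable parameters.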

A word of caution: this code is very slow, as it performs many requests for every URL you want to shorten. It will also fail to shorten many URLs, because the server might include things like timestamps in the response, so two fetches of the same page are not byte-for-byte identical. This makes the problem very hard, and that's the reason why companies like Google have many engineers who think about stuff like this :).
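One way to soften the timestamp problem is to accept responses that are nearly identical rather than exactly equal. The sketch below uses PHP's similar_text() for that. The helper name looksEquivalent() and the 95% threshold are assumptions chosen for illustration, not tested values, and similar_text() is itself slow on large pages.

```php
<?php
// Hypothetical helper: treat two responses as the same page when they
// are at least $minPercent percent similar.
function looksEquivalent(string $a, string $b, float $minPercent = 95.0): bool
{
    if ($a === $b) {
        return true; // identical, no need for the expensive comparison
    }

    // similar_text() writes the similarity percentage into $percent.
    similar_text($a, $b, $percent);
    return $percent >= $minPercent;
}

// Two responses that differ only in an embedded timestamp.
var_dump(looksEquivalent(
    '<p>Hello world, generated at 12:00:01</p>',
    '<p>Hello world, generated at 12:00:02</p>'
));
```

A threshold like this can misfire in both directions: a small but meaningful change may pass, and a page with rotating ads may fail, so it would need tuning per site.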