Finding broken links using HttpWebRequest / HttpWebResponse in C#

The status code in the response tells us whether a link is broken or not. In practice, though, GetResponse() throws an exception for most broken links (for example a 404 response, a timeout, or an unreachable host). So the Timeout property of the WebRequest plays an important role here.

That is, if we specify a larger timeout value, the total execution takes more time. If we specify a smaller timeout, there is a possibility of declaring a valid link as a broken link. If anyone knows how to handle this appropriately, please mention it in the comments.
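One possible way to soften this trade-off is to start with a short timeout and retry with a longer one only when the failure really was a timeout. The sketch below is just an illustration under that assumption: the names TimeoutRetrySketch, IsBrokenWithRetry and ShouldRetry are made up for this example, and the timeout values are arbitrary.

```csharp
using System;
using System.Net;

static class TimeoutRetrySketch
{
    // Hypothetical helper: tries each timeout in turn and retries
    // only when the previous attempt failed because it timed out.
    public static bool IsBrokenWithRetry(string url, int[] timeoutsMs)
    {
        foreach (int timeoutMs in timeoutsMs)
        {
            try
            {
                HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
                request.Timeout = timeoutMs; // milliseconds

                using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
                {
                    return response.StatusCode != HttpStatusCode.OK;
                }
            }
            catch (WebException ex)
            {
                if (ShouldRetry(ex.Status))
                {
                    continue; // slow server: try again with the next timeout
                }
                return true; // 4xx/5xx, DNS failure, refused connection, ...
            }
            catch (Exception)
            {
                return true; // malformed URL or a non-HTTP scheme
            }
        }
        return true; // every attempt timed out
    }

    // Only a timeout is worth retrying with a larger budget.
    public static bool ShouldRetry(WebExceptionStatus status)
    {
        return status == WebExceptionStatus.Timeout;
    }
}
```

For example, IsBrokenWithRetry(url, new[] { 2000, 10000 }) keeps the common case fast and spends the longer wait only on links that were slow the first time.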



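// Requires: using System; using System.Net;
private bool isBrokenLink(string url)
{
    try
    {
        WebRequest http = WebRequest.Create(url);
        http.Timeout = 5000; // milliseconds

        // Dispose the response so the underlying connection is released;
        // otherwise later requests to the same host can stall.
        using (HttpWebResponse httpresponse = (HttpWebResponse)http.GetResponse())
        {
            // Only 200 OK is treated as a working link.
            return httpresponse.StatusCode != HttpStatusCode.OK;
        }
    }
    catch (Exception)
    {
        // Thrown for 4xx/5xx responses, timeouts, unreachable hosts and malformed URLs.
        return true;
    }
}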
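The method above reports every failure as simply "broken". When it helps to know why a link failed, the WebException itself carries that information. Here is a minimal sketch, assuming a plain text description is enough; LinkDiagnosisSketch, Describe and DescribeFailure are hypothetical names used only for this example.

```csharp
using System;
using System.Net;

static class LinkDiagnosisSketch
{
    // Hypothetical helper: describes the outcome instead of returning true/false.
    public static string Describe(string url, int timeoutMs)
    {
        try
        {
            WebRequest request = WebRequest.Create(url);
            request.Timeout = timeoutMs;

            using (WebResponse response = request.GetResponse())
            {
                HttpWebResponse http = response as HttpWebResponse;
                return http != null ? "HTTP " + (int)http.StatusCode : "non-HTTP response";
            }
        }
        catch (WebException ex)
        {
            return DescribeFailure(ex);
        }
        catch (Exception ex)
        {
            // For example UriFormatException for a malformed URL.
            return "invalid request: " + ex.GetType().Name;
        }
    }

    public static string DescribeFailure(WebException ex)
    {
        // For 4xx/5xx answers the server's response is attached to the exception.
        HttpWebResponse http = ex.Response as HttpWebResponse;
        if (http != null)
        {
            using (http)
            {
                return "HTTP " + (int)http.StatusCode;
            }
        }
        // No response at all: Timeout, NameResolutionFailure, ConnectFailure, ...
        return ex.Status.ToString();
    }
}
```

With this, a missing page and a slow server can be told apart, which is useful when deciding whether a link deserves a second attempt.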

Making the two changes below to the code above (setting a User-Agent and using the HEAD method) may improve the performance.

HttpWebRequest http = (HttpWebRequest)WebRequest.Create(url);
http.UserAgent = "Mozilla/9.0 (compatible; MSIE 6.0; Windows 98)";
http.Method = "HEAD";

The HEAD method asks the server for the response headers only, so the link can be verified without downloading the entire content. This improves the performance significantly, particularly when checking for missing images.
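One caveat: some servers do not support HEAD and answer with 405 (Method Not Allowed) or 501 (Not Implemented) even though the page exists. A possible workaround is to fall back to GET in that case. The sketch below assumes those two status codes are the only ones worth a second attempt; HeadFallbackSketch, IsBroken, NeedsGetFallback and Probe are hypothetical names for this example.

```csharp
using System;
using System.Net;

static class HeadFallbackSketch
{
    // Hypothetical helper: checks with HEAD first, then GET if HEAD is rejected.
    public static bool IsBroken(string url, int timeoutMs)
    {
        HttpStatusCode? status = Probe(url, "HEAD", timeoutMs);
        if (status.HasValue && NeedsGetFallback(status.Value))
        {
            status = Probe(url, "GET", timeoutMs);
        }
        // null (no HTTP response at all) also counts as broken.
        return status != HttpStatusCode.OK;
    }

    // Assumption: only these two codes mean "HEAD itself is not supported".
    public static bool NeedsGetFallback(HttpStatusCode status)
    {
        return status == HttpStatusCode.MethodNotAllowed
            || status == HttpStatusCode.NotImplemented;
    }

    // Returns the HTTP status code, or null when no HTTP response arrived.
    static HttpStatusCode? Probe(string url, string method, int timeoutMs)
    {
        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            request.Method = method;
            request.Timeout = timeoutMs;

            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                return response.StatusCode;
            }
        }
        catch (WebException ex)
        {
            // 4xx/5xx answers arrive here with the response attached.
            using (HttpWebResponse response = ex.Response as HttpWebResponse)
            {
                return response != null ? response.StatusCode : (HttpStatusCode?)null;
            }
        }
        catch (Exception)
        {
            return null; // malformed URL or a non-HTTP scheme
        }
    }
}
```

Most links still cost only one cheap HEAD request; the full GET is paid only for servers that refuse HEAD.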

If you have a better solution, just tell me!
