A Foolish Expectation

To me, yesterday’s announcement of an increase in petroleum prices looked like the government’s attempt at an April Fools’ prank. I waited the entire day for confirmation that it was a prank. However, I turned out to be the fool for expecting a government (a group of serious people with the authority to govern a country) to play pranks. Despite the fact that petroleum prices are falling sharply in the international market, the government decided to increase them for a reason that is unknown (at least to me). Being an innocent, law-abiding citizen, I welcome and appreciate the move of this group of serious people….

Posted in Technical

Towards Urdu News Mining

Urdu News Tag Cloud

Over the weekend I developed a small program that collects data from three Urdu news agencies, namely Jang, Naw-e-Waqat and BBC Urdu. The program collects the news and extracts the important keywords, then generates a tag cloud from the keywords extracted from the news headlines. The application is currently hosted at www.newslink.pk.

There is a lot of work left to do, but it seems like a good beginning!
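The post does not say how the keywords are chosen, so the following is only a minimal Python sketch of one plausible approach: count the words in the headlines after dropping common stop words, then map the counts onto font sizes for the tag cloud. The function names, the short stop-word list, and the pixel range are all illustrative assumptions, not the code behind the site.

```python
from collections import Counter

# Illustrative stop-word list (assumption); a real Urdu list would be much longer.
STOP_WORDS = {"کی", "کے", "کا", "میں", "سے", "اور", "نے", "کو", "پر", "ہے"}


def keyword_weights(headlines, stop_words=STOP_WORDS, top_n=20):
    """Count non-stop-word tokens across all headlines, most frequent first."""
    counts = Counter()
    for headline in headlines:
        for token in headline.split():
            # Strip common Urdu and Latin punctuation from both ends of the token.
            token = token.strip("،۔؟!:\"'()")
            if token and token not in stop_words:
                counts[token] += 1
    return counts.most_common(top_n)


def font_sizes(weights, min_px=12, max_px=36):
    """Map keyword counts linearly onto a font-size range for the tag cloud."""
    if not weights:
        return {}
    lo = min(count for _, count in weights)
    hi = max(count for _, count in weights)
    span = (hi - lo) or 1  # avoid dividing by zero when all counts are equal
    return {
        word: min_px + (count - lo) * (max_px - min_px) // span
        for word, count in weights
    }
```

Calling font_sizes(keyword_weights(headlines)) gives a word-to-size mapping that a page template could render directly. A linear scale is the simplest choice; a logarithmic one may look better when a few words dominate.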

Posted in Technical

Performance Measurement of a Web Application Hosted on Different Types of Amazon EC2 Instances

Introduction: I recently conducted a quick lab on the Amazon cloud for Web Applications Engineering students at AIT. We launched different types of EC2 instances supporting a 32-bit OS and installed a simple Web application. Then we benchmarked the performance of the EC2 instances for the sample Web application. I am documenting the details of the lab so that anyone can reproduce it.

Groups and Types of EC2 Instances: We divided the students into four groups, and each group launched a different type of EC2 instance. The group names and the instance types they launched are as follows:

  • Group1: EC2 micro instance
  • Group2: EC2 small instance
  • Group3: EC2 medium instance (m1.medium)
  • Group4: EC2 medium instance (c1.medium)

Note: You can read the specification of these instances at http://aws.amazon.com/ec2/instance-types/.

Sample Web Application: All four groups launched instances based on a private AMI that already has the Apache web server, PHP, and a sample web application installed. The application contains two pages: index.php and quad.php. The index.php page just provides a link to quad.php, which solves a quadratic equation after generating random values for a, b, and c. (See the bottom of the post to download the source code for both pages.)
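The real quad.php is PHP and ships with the lab material. For readers who only want to see the kind of work each request triggers, here is a rough Python sketch of the same idea. The coefficient range and the function names are assumptions made for illustration.

```python
import cmath
import random


def solve_quadratic(a, b, c):
    """Return both roots of a*x^2 + b*x + c = 0 (complex when the discriminant is negative)."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    root = cmath.sqrt(b * b - 4 * a * c)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


def random_quadratic(rng=random):
    """Pick random integer coefficients; the range -10..10 is an assumption."""
    a = rng.choice([n for n in range(-10, 11) if n != 0])  # a must not be zero
    return a, rng.randint(-10, 10), rng.randint(-10, 10)


if __name__ == "__main__":
    a, b, c = random_quadratic()
    print(a, b, c, solve_quadratic(a, b, c))
```

The work per request is tiny and purely CPU-bound, which is probably why it makes a convenient page for comparing instance types.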

Workload Generation: We used httperf to generate a synthetic workload for the application in a step-up fashion. In simple words, we keep increasing the number of user sessions to the application over time. Each user session makes a request to index.php, waits for a few seconds, and then makes a request to quad.php. (The workload generation files db.sh, urls.txt, and httperf_script.sh are also available in the lab material; see the bottom of the post. You need to set the server parameter in httperf_script.sh to the public DNS name of your instance and then run the db.sh script.)
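The actual scripts are in the lab material and may differ from what follows. As a hedged illustration of the step-up idea, this Python sketch builds one httperf command line per stage, raising the session rate each time. The rates, the think time, the host name, and the helper names are assumptions. The --server, --wsesslog and --rate options are standard httperf options, but the session file must be in the format httperf expects for --wsesslog.

```python
import shlex


def step_up_schedule(start_rate, step, steps):
    """Session rates (sessions per second) for each stage of a step-up workload."""
    return [start_rate + i * step for i in range(steps)]


def httperf_command(server, rate, sessions, think_time=3, session_file="urls.txt"):
    """Build the httperf argument list for one stage (values are illustrative)."""
    return [
        "httperf",
        f"--server={server}",
        # --wsesslog=N,X,F: N sessions, default think time X seconds, session file F
        f"--wsesslog={sessions},{think_time},{session_file}",
        # with session workloads, --rate is the number of new sessions per second
        f"--rate={rate}",
    ]


if __name__ == "__main__":
    # Hypothetical host name; replace it with the public DNS name of your instance.
    for rate in step_up_schedule(start_rate=1, step=1, steps=5):
        print(shlex.join(httperf_command("ec2-host.example.com", rate, rate * 60)))
```

Printing the commands rather than running them keeps the sketch safe to try; each line could then be executed in turn, for example with subprocess.run.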

Results: The following four graphs show the average response time during the experiment for each instance type.


EC2 micro instance performance


EC2 small instance performance


EC2 m1.medium instance performance


EC2 c1.medium instance performance

We have seen that the performance of the different instance types varies for the same application and workload. The Medium instances (c1.medium and m1.medium) are very consistent in their performance: once we see a dramatic growth in the response time, we never observe any subsequent decrease. The Small instance, however, is not as consistent as the Medium instances. The Micro instance’s performance is quite interesting: it fluctuates over time. This inconsistency is mainly due to a special CPU allocation strategy used by Amazon. More detail about Micro instance CPU scheduling is available at http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/concepts_micro_instances.html.

Conclusion: Different types of EC2 instances are available for hosting Web applications; however, we need to identify the most appropriate one, the one that reduces the cost while maintaining good performance for the application. The lab provides hands-on experience in using EC2 instances and benchmarking applications hosted on them.

Download Lab Material (httperf scripts and sample Web application)

Posted in Technical

Add Thai Language Support on Debian OS

On a Debian-based system, we can install Thai language support by using the following command:

sudo apt-get install 'libthai*'

After that, we can select the Thai UTF-8 locale (th_TH.UTF-8) to enable it, using the following command:

sudo dpkg-reconfigure locales
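To confirm that the locale was actually generated, one option is a quick check from Python's standard library. This is just a sketch. It assumes the locale is named th_TH.UTF-8, the usual name on Debian, and locale_available is a helper name made up for this example.

```python
import locale


def locale_available(name="th_TH.UTF-8"):
    """Return True if the C library accepts `name` as a locale on this system."""
    previous = locale.setlocale(locale.LC_CTYPE)  # remember the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_CTYPE, previous)  # always restore it
    return True


if __name__ == "__main__":
    print("Thai UTF-8 locale available:", locale_available())
```

If this prints False after reconfiguring, the locale was probably not ticked in the dpkg-reconfigure dialog.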

Posted in Technical

One Strategy for Writing a Research Statement

I found Will Bridewell’s strategy for writing a research statement useful and would like to archive it here. The following article is copied from http://cll.stanford.edu/~willb/research_statement.php

One Strategy for Writing a Research Statement
If you’re in the market for an academic job, chances are that you’ll have to submit a research statement of some sort. These critters offer you the chance to distill your life’s work and dreams into a short essay that can only hurt your chances of employment. Since you probably just finished your thesis, where length is a virtue, writing this statement can be daunting. While examples of research papers abound, there are relatively few research statements on the web. None of those were peer reviewed. Most of them were written by fresh Ph.D.s who found themselves in exactly your position.

To help you out, I’m going to describe a fairly specific structure to follow. This may or may not fit your situation, and it may not generalize beyond computer-related research. I developed the structure out of need by analyzing statements from people who (a) have jobs and (b) work in my general field. Recently IEEE Intelligent Systems ran the feature “AI’s 10 to Watch” where each person provided a 5 to 10 paragraph research statement suited for a relatively general audience. These vary in quality and message, but more importantly they have commonalities that one can distill.

The outline below fits my own work fairly well, and it may cover your case too. If not, then you may find a better model in one of the other narratives based on the degree of impact and novelty of your own research. Additionally, you can read the previous installment from 2006.

The Outline

  • Define your general problem area. If it’s a well-known area, like natural language processing or machine learning, then identify what you think is the primary question and its merit; that is, take a stance. If it’s a new research domain, motivate it. If it’s really new, see the first 3 paragraphs of Levin’s statement from the IEEE article.
  • Narrow your focus to a particular aspect of the domain (e.g., scope-limited natural language processing, path planning in unknown places, role of social networks for semantic web), state your own motivation for doing so, and note the limitations. You can also describe Big Problem + Challenge + Approach at a high level. The details of this paragraph depend on how familiar your problem is. Think in terms of a subtask, but not the solution. For example, this paragraph might talk about “transfer learning as a problem in machine learning” but it would not talk about “analogy for transfer learning”, which comes next.
  • Announce your secret weapon. What do you have that others don’t? Why can you address the challenge above when others have failed or even overlooked it? This could/should be some technology that you built. Don’t be too technical here, because nobody cares if you used Markov-Chain Monte Carlo, Gibbs sampling, etc. If your research has already had an impact, mention it at the end of this paragraph (e.g., “Hundreds/Thousands/Millions of people/researchers use this all the time!”) Otherwise, mention how this tool helped you address the challenge from paragraph 2. The tone of this paragraph has to be just right. See the statements by Morency and Cimiano and compare those to Sudderth and Milch. Luis von Ahn has two “secret weapon” paragraphs, but his research statement differs from the others due to the practical focus and large impact of his work.
  • State your current focus. What’s the big unaddressed problem? Why hasn’t it been addressed? (Hopefully because your technology wasn’t around.) How will you address this problem? This should be particularly passionate, and it can help to include a bigger or more general long term goal.
  • End on a strong note. This paragraph (or the last one for longer statements) was the most variable in the examples. Sudderth’s statement paints a rich vision of the future. Other statements cover the “big question” in paragraph 4. In that case, the last paragraph reflects “my approach” and ends with a sentence that claims an expansive general impact or that broadens the focus to the level of the first paragraph. Do not end with “My domain is great/relevant!” Consider Mika’s last paragraph as a positive example.

Overall, I think that 5 paragraphs is sufficient for a starting point. You can always add more “Big Problem + Challenge + Approach” paragraphs if you’ve done two or more great things, but brevity is of great value. If the statement’s only one page, people may actually read it.

Posted in Technical

Remove a folder recursively in Ubuntu

Often we need to remove a particular folder or file everywhere it occurs under a directory tree in Ubuntu/Linux. To do so, switch your terminal to the top-level folder and invoke the following command:

find . -name "FOLDER_OR_FILE_NAME" -prune -exec rm -rf {} \;

Let’s say we need to find and remove all “.svn” folders under /home/waheed/mysvn. I would take the following steps:

    • Launch a terminal (press Alt+F2 and type gnome-terminal, or open it from the menu)
    • Switch to the directory (cd /home/waheed/mysvn)
    • Type and enter the command:

find . -type d -name ".svn" -prune -exec rm -rf {} \;
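Because rm -rf is unforgiving, it can help to preview what would be deleted first. The following is a minimal Python sketch of the same clean-up with a dry-run mode. The function name remove_named and its dry_run flag are made up for this example, and it is an alternative to the find command rather than a description of it.

```python
import os
import shutil


def remove_named(root, name, dry_run=True):
    """Find every file or folder called `name` under `root`.

    Returns the matching paths; deletes them only when dry_run is False.
    """
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for entry in list(dirnames) + filenames:
            if entry != name:
                continue
            path = os.path.join(dirpath, entry)
            matches.append(path)
            if entry in dirnames:
                # Do not descend into a folder that is about to be removed.
                dirnames.remove(entry)
            if dry_run:
                continue
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path)
            else:
                os.remove(path)
    return matches


if __name__ == "__main__":
    # Dry run: only list what would be removed under the current folder.
    for path in remove_named(".", ".svn"):
        print(path)
```

Run it once with the default dry_run=True, check the printed paths, and pass dry_run=False only when the list looks right.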
Posted in Knowledge Base, Technical