
Sunday, June 23, 2013

What is Hadoop and how does it work?

Hadoop is gaining a lot of interest these days. It is a framework for processing very large data sets, commonly called big data.

Here is a presentation explaining the basics of Hadoop.



Drop a comment or send a mail with any questions or suggestions.

Thursday, June 6, 2013

How to install and uninstall Tomcat on Ubuntu



Installation
$ sudo apt-cache search tomcat
$ sudo apt-get install tomcat7

       You can run the second command directly. The first command is only needed if you want to check which Tomcat versions are available.
Start
$ sudo /etc/init.d/tomcat7 start
Stop
$ sudo /etc/init.d/tomcat7 stop
Uninstall
$ dpkg -l | grep tomcat
           grep filters the package list down to the lines that match the pattern "tomcat".
$ sudo dpkg -P tomcat7
           sudo runs the command with root privileges, which are needed to install or uninstall packages. The -P option purges the package together with its configuration files.
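
To confirm what is left after the uninstall, the state column of the dpkg -l listing can be read directly. The sketch below is only an illustration: tomcat_packages is a hypothetical helper name, not part of dpkg, and it assumes the usual dpkg -l layout, where the first column is the package state (ii = installed, rc = removed but configuration files left behind) and the second column is the package name.

```shell
#!/bin/sh
# Hypothetical helper: reads `dpkg -l` output on stdin and prints
# "<state> <name>" for every package whose name contains "tomcat".
# Header lines of the listing are skipped because their second
# field never contains "tomcat".
tomcat_packages() {
    awk '$2 ~ /tomcat/ { print $1, $2 }'
}

# Usage (on a Debian/Ubuntu machine):
#   dpkg -l | tomcat_packages
```

After a successful purge this should normally print nothing for tomcat7. A line starting with rc would mean the package was removed but its configuration files are still on disk.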

Sunday, June 2, 2013

Cloud Computing - Basics

New to Cloud?
Check out this simple presentation, which gives you the basic idea of cloud computing.
The presentation is self-explanatory.


You can leave feedback or questions in the comments below. :)

Monday, May 20, 2013

BIG DATA - the data ocean

Big data, as the name suggests, is data that is huge. What we need to know is what makes a data set big enough to count.

#What is Big Data?
According to IDC (International Data Corporation), data is classified as big data based on three parameters: Volume, Velocity and Variety.
Volume refers to the size of the data. E.g.: the hundreds of millions of tweets posted every day.
Velocity refers to how fast you need the analysis of that data. E.g.: during an IPL (cricket) match, to display which team is getting more tweets, the system has to query the incoming tweets by team tag, such as #CSK, while the match is still being played.
Variety refers to the types of data you have: text, image, audio, sensor information, video, or any mixture of these. E.g.: Facebook processes your status messages (text), photos (images) and videos. The data may be structured or unstructured.
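
The per-team tweet count from the velocity example can be sketched with standard text tools. This is a toy illustration only: it assumes the tweets arrive as plain text on standard input, one tweet per line, and count_hashtags is a hypothetical helper name. A real system would read from a streaming source and would not fit in a single pipe.

```shell
#!/bin/sh
# Hypothetical helper: reads tweets on stdin (one per line) and prints
# "<hashtag> <count>" lines, most frequent hashtag first.
count_hashtags() {
    grep -o '#[A-Za-z0-9_][A-Za-z0-9_]*' |  # pull out every hashtag
        tr '[:lower:]' '[:upper:]' |        # treat #csk and #CSK as the same tag
        sort |                              # group identical tags together
        uniq -c |                           # count each group
        sort -k1,1nr -k2 |                  # highest count first, ties by tag
        awk '{ print $2, $1 }'              # print tag first, then its count
}

# Usage:
#   printf '%s\n' 'What a six! #CSK' 'Go #MI go' '#csk all the way #CSK' | count_hashtags
```

For the three sample tweets in the usage comment, the helper prints "#CSK 3" followed by "#MI 1".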

According to SAS (another company), variability and complexity are two further key terms.
Variability refers to how much the data size and the time at which the data arrives fluctuate; it is typical of trending data.
Complexity is close to the variety term in the IDC definition. Data comes from multiple sources in multiple formats, and we need to extract and process the relevant information rather than blindly joining everything.

#Where is it used?
It is used by organisations whose data has the properties above, such as Google, Facebook and Twitter. Size alone does not decide what counts as big data: 1 terabyte may be a huge amount of data for one organisation, while even 1 GB may be huge for another.

#How is it used?
Defining big data is not enough on its own. The point of the definition is to run analytics over that huge amount of data, derive new information from it, and use that information to shape new ideas for growing the organisation.
To do this, big data is processed with the Hadoop framework (will be posted soon), which yields the results.
"Small data is gone. Data is just going to get bigger and bigger and bigger, and people just have to think differently about how they manage it."

You can leave feedback or questions below.