Read CSV file – Python
import os
import pandas as pd

os.getcwd()
tendulkar = pd.read_csv("/dbfs/FileStore/tables/tendulkar.csv", header='infer')
tendulkar.shape
#os.listdir('dbfs/FileStore/tables/')
Read CSV file – Pyspark
from pyspark import SparkContext, SparkConf
from pyspark.sql import SQLContext

conf = SparkConf().setAppName("CDI Transform")
sc = SparkContext.getOrCreate()
sqlContext = SQLContext(sc)
tendulkar1 = (sqlContext
    .read.format("com.databricks.spark.csv")
    .options(delimiter=',', header='true', inferschema='true')
    .load("/FileStore/tables/tendulkar.csv"))
Data frame shape – Python
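As a minimal sketch of the pandas side: `DataFrame.shape` returns a `(rows, columns)` tuple. The small `sample` frame below is a hypothetical stand-in for tendulkar.csv, with made-up rows, so the snippet runs without the file.

```python
import pandas as pd

# Hypothetical sample rows standing in for tendulkar.csv (values are made up)
sample = pd.DataFrame({
    "Runs": ["15", "DNB", "59", "8"],
    "BF": ["24", "-", "172", "16"],
    "Ground": ["Karachi", "Karachi", "Faisalabad", "Lahore"],
})

# shape is an attribute, not a method: (number of rows, number of columns)
n_rows, n_cols = sample.shape
print(n_rows, n_cols)
```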
Data frame shape – Pyspark
tendulkar1.count()
len(tendulkar1.columns)

def dfShape(df):
    return (df.count(), len(df.columns))

dfShape(tendulkar1)
Data frame columns – Python
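A minimal pandas sketch for listing column names, assuming a small hypothetical `sample` frame in place of tendulkar.csv: `DataFrame.columns` returns an Index, which can be converted to a plain list.

```python
import pandas as pd

# Hypothetical sample rows standing in for tendulkar.csv (values are made up)
sample = pd.DataFrame({
    "Runs": ["15", "59", "8"],
    "BF": ["24", "172", "16"],
    "Ground": ["Karachi", "Faisalabad", "Lahore"],
})

# columns is an Index object; list() gives the names as ordinary strings
column_names = list(sample.columns)
print(column_names)
```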
Data frame columns – Pyspark
Dtypes – Python
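One way to inspect column types in pandas is `DataFrame.dtypes`. The sketch below uses a hypothetical `sample` frame with made-up values, where `Runs` is stored as text and `Mins` as integers, and shows that `pd.to_numeric` turns the text column into a numeric one.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical sample rows (values are made up); Runs is held as text here
sample = pd.DataFrame({
    "Runs": ["15", "59", "8"],
    "Mins": [28, 254, 24],
})

# One dtype per column
print(sample.dtypes)

# Text columns are not numeric until they are converted explicitly
runs_is_numeric_before = is_numeric_dtype(sample["Runs"])
runs_numeric = pd.to_numeric(sample["Runs"])
runs_is_numeric_after = is_numeric_dtype(runs_numeric)
```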
Select columns – Python
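A minimal sketch of column selection in pandas, again on a hypothetical `sample` frame with made-up values: a list of names inside the brackets returns a DataFrame, while a single name returns a Series.

```python
import pandas as pd

# Hypothetical sample rows standing in for tendulkar.csv (values are made up)
sample = pd.DataFrame({
    "Runs": ["15", "59", "8"],
    "BF": ["24", "172", "16"],
    "Ground": ["Karachi", "Faisalabad", "Lahore"],
})

# A list of column names selects a sub-DataFrame
subset = sample[["Runs", "BF"]]

# A single column name selects a Series
runs = sample["Runs"]

print(subset.head())
```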
Select columns – Pyspark
Filter rows by criteria – Python
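# Assumes 'Runs' has already been converted to numeric;
# comparing the raw text column with 50 raises a TypeError
b = tendulkar['Runs'] > 50
df = tendulkar[b]
df.head(10)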
Filtering by condition – Pyspark
Display unique contents of a column – Python
tendulkar = pd.read_csv("/dbfs/FileStore/tables/tendulkar.csv", header='infer')
tendulkar['Runs'].unique()
Display unique contents of a column – Pyspark
from pyspark.sql.functions import *

tendulkar1 = (sqlContext
    .read.format("com.databricks.spark.csv")
    .options(delimiter=',', header='true', inferschema='true')
    .load("/FileStore/tables/tendulkar.csv"))
tendulkar1.select('Runs').rdd.distinct().collect()
Aggregate mean, max, min – Python
#tendulkar[['Mins','BF','Runs']].groupby('Runs').mean()
import os
import pandas as pd

os.getcwd()
tendulkar = pd.read_csv("/dbfs/FileStore/tables/tendulkar.csv", header='infer')
tendulkar.shape
# Remove rows which have DNB
a = tendulkar.Runs != "DNB"
tendulkar = tendulkar[a]
tendulkar.shape
# Remove rows which have TDNB
b = tendulkar.Runs != "TDNB"
tendulkar = tendulkar[b]
tendulkar.shape
# Remove rows where BF is recorded as '-'
c = tendulkar.BF != "-"
tendulkar = tendulkar[c]
# Remove the '*' character; regex=True is required on pandas 2.0+,
# where str.replace treats the pattern as a literal string by default
tendulkar.Runs = tendulkar.Runs.str.replace(r"[*]", "", regex=True)
#tendulkar.shape
type(tendulkar['Runs'])
# Convert the cleaned text columns to numbers before aggregating
tendulkar['Runs'] = pd.to_numeric(tendulkar['Runs'])
tendulkar['BF'] = pd.to_numeric(tendulkar['BF'])
df = tendulkar[['Runs','BF','Ground']].groupby('Ground').agg(['mean','min','max'])
df.head(10)
Aggregate mean, max, min – Pyspark
from pyspark.sql.functions import *
from pyspark.sql import functions as F

tendulkar1 = (sqlContext
    .read.format("com.databricks.spark.csv")
    .options(delimiter=',', header='true', inferschema='true')
    .load("/FileStore/tables/tendulkar.csv"))
# Remove rows which have DNB
tendulkar1 = tendulkar1.where(tendulkar1['Runs'] != 'DNB')
print(dfShape(tendulkar1))
# Remove rows which have TDNB
tendulkar1 = tendulkar1.where(tendulkar1['Runs'] != 'TDNB')
print(dfShape(tendulkar1))
# Remove the '*' character and cast to integer, so that min and max
# compare numbers rather than strings
tendulkar1 = tendulkar1.withColumn('Runs', regexp_replace('Runs', '[*]', '').cast('int'))
tendulkar1.select('Runs').rdd.distinct().collect()
df = (tendulkar1[['Runs','BF','Ground']]
    .groupby(tendulkar1['Ground'])
    .agg(F.mean(tendulkar1['Runs']), F.min(tendulkar1['Runs']), F.max(tendulkar1['Runs'])))
df.show()
Great achievements are the sum of small, tiny victories. It is important that we take tiny steps towards our goals in life. The Chinese adage, “The journey of a thousand miles begins with a single step” is both real and true. We need to make a start. We need to act.
Think big, but start small. The great achievements of a Michael Jordan, a Sachin Tendulkar or a Tiger Woods are the result of tiny successes and small victories along the way. We may not get to the very top, but we can still have a significant impact in what we do.
There are several kinds of people in this world.
There are the ‘general drifters’. These people have no goals and no vision. They generally drift about in life. Since they are not headed anywhere, they reach nowhere.
Then there are the ‘self-doubters’. These people doubt their own abilities even before they get started. They hobble themselves so badly with the chain of self-doubt that they make little or no progress.
Then there are the ‘oversized goal seekers’. These people want the moon and want it now. They set themselves unrealistic goals in life. As a result, they feel guilty for not achieving their targeted goals.
What is important is that we set ourselves realistic and achievable goals. When we succeed in these little goals that we set for ourselves, we add to our ‘fund of self-esteem’, as Stephen Covey states in his book ‘Seven habits of …’. We need to take the first step, but we need to take it in the direction of a small and achievable target.
It is perfectly fine to fail. It is perfectly fine to make mistakes. Neil Gaiman says “Make New Mistakes. Make glorious, amazing mistakes. Make mistakes nobody’s ever made before. Because if you are making mistakes, then you are making new things, trying new things, learning, living, pushing yourself, changing yourself, changing your world. You’re doing things you’ve never done before, and more importantly, you’re Doing Something.” You can read more at “Make good art”.
The small successes that we achieve not only increase our confidence but also motivate us to do even better. In other words, they form a virtuous cycle in which we are goaded to excel.
It is never too early to start working towards your goals. Rather than waiting for an auspicious time, get started right away.
More importantly, it is also never too late to start. So whether you are in your teens or on the ‘other’ side of 40, it is not too late to begin. You still have a lifetime to achieve your goals.
In other words, you can still ‘conquer the world!’
The ability to learn is a learned ability. While the previous sentence may appear to be circular, in reality it is not. What I am saying is that learning, like swimming, is an ability that can be learned. It is not something that we are born with.
Learning like any other ability can be cultivated, honed and sharpened.
There are no stupid people. In my opinion people are either lazy thinkers, erroneous thinkers or distracted thinkers.
We usually tend to assume that when a child or a person has difficulty understanding or solving something, he/she is innately stupid and nothing can be done about it.
Some people have difficulty because they do not want to exercise their grey cells. Absorbing, thinking and analyzing require effort, in much the same way as lifting a heavy object. Some people are averse to putting in that effort and try to take the easy way out by guessing or randomly choosing a possibility, which invariably is wrong.
There are others who follow erroneous thinking. Here, I have observed that people tend to rush to understand. This cannot work. The right amount of effort over the right amount of time is necessary to understand and absorb. If this is not done, nothing can be achieved. Here I am reminded of people trying to learn swimming. In their haste to swim like a pro, they thrash and splash the water. The only result is a lot of water being splashed all around with little or no movement.
Then there are others who try too hard to learn. This is not going to work for obvious reasons because “if you stare you will not see”. The approach to learning must be relaxed and easy.
Also, what is needed is focus and concentration. We must have adequate attention on what we are trying to learn. There should be no constraints of time or quantity. We should go with the flow.
Finally there must be a strong desire and an urge to learn.
So, in conclusion, learning to learn is a learned ability. We need focus, concentration, a relaxed approach and adequate effort.
If these are followed, learning will not only be easy but also enjoyable!