Showing posts with label python. Show all posts

Monday, January 5, 2015

how to print percentages in python

A very simple pattern for showing a percentage in a printed string:
>>> print "%.0f%%" % (100.0 * 1/3)
33%
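The same output can also be produced with the % presentation type of str.format, which multiplies by 100 and appends the percent sign on its own. A minimal sketch (the variable names are only illustrative):

```python
ratio = 1 / 3.0

# old-style formatting: scale by hand and escape the percent sign as %%
old_style = "%.0f%%" % (100.0 * ratio)

# str.format with the % presentation type: scaling and sign are automatic
new_style = "{:.0%}".format(ratio)

print(old_style)
print(new_style)
```

Both lines print 33%.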

Tuesday, August 19, 2014

Switch your EC2 instance from command line with python and BOTO

First things first: security. Before you start messing around with your AWS credentials, make sure you take the required precautions.

You don't want to wake up and find that someone has launched 60 c3.8xlarge servers across 5 regions using the credentials that you left in a backed-up directory...

First, create a user if you don't have one yet: go to the IAM console, create a new user, and that's it.

Then the more interesting part: attach a user policy to that user. This limits the damage that the user can do in your AWS account.


  1. Click on the user -> in user policies -> click on "Attach User Policy"
  2. Click on the custom policy generator
  3. Give your policy a name (e.g. 'my-restricted-policy')
  4. Copy and paste the following policy:
{
   "Version": "2012-10-17",
   "Statement": [{
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances", "ec2:DescribeImages",
        "ec2:DescribeAvailabilityZones",
        "ec2:StopInstances", "ec2:StartInstances"
      ],
      "Resource": "*"
    }
   ]
}

You are almost done: you just need to generate the security credentials for the user with the restricted policy.

So click on the user -> select Security Credentials, click on Manage Access Keys and then Create access key.

At this point you should have something that looks like the following:

aws_access_key_id        = 'DSFSDFSDFWEFEWF'
aws_secret_access_key  = 'ldfjjs8wnoliencdnscdmsfkmsdkfml32'


The first part is done; now a few more lines of Python and that's it:
import boto.ec2


name_of_the_instance = 'put-here-the-name-of-your-instance'
name_of_the_region = 'put-here-the-name-of-the-region-where-the-machine-is-located-eg-eu-west-1'

# access key only for on / off
aws_access_key_id_     = 'DSFSDFSDFWEFEWF'
aws_secret_access_key_ = 'ldfjjs8wnoliencdnscdmsfkmsdkfml32'


# connect to the region where the instance lives
conn = boto.ec2.connect_to_region(name_of_the_region,
                                  aws_access_key_id     = aws_access_key_id_,
                                  aws_secret_access_key = aws_secret_access_key_ )

# look the instance up by its Name tag, then stop it
inst = conn.get_all_instances(filters={'tag:Name': name_of_the_instance})[0].instances[0]
print inst.stop()
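Since the policy above allows both ec2:StartInstances and ec2:StopInstances, the same lookup can be used to switch the machine either way. A rough sketch, assuming conn is a boto EC2 connection like the one created above; the function name switch_instance and its error handling are illustrative and not part of boto:

```python
def switch_instance(conn, instance_name, action):
    """Start or stop the instance whose Name tag equals instance_name.

    conn is assumed to be a boto EC2 connection; action is 'start' or 'stop'.
    """
    if action not in ('start', 'stop'):
        raise ValueError("action must be 'start' or 'stop'")

    # same lookup as in the post: filter the reservations by the Name tag
    reservations = conn.get_all_instances(filters={'tag:Name': instance_name})
    if not reservations:
        raise LookupError('no instance named %r' % instance_name)

    inst = reservations[0].instances[0]
    if action == 'start':
        inst.start()
    else:
        inst.stop()
    return inst
```

From the command line this could be driven with something like switch_instance(conn, name_of_the_instance, sys.argv[1]).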

Friday, April 25, 2014

filling the beautiful (and damn) nparray in a loop

Simple: first append vector by vector into a list
 
tmp_list = []

for i in myiterations: 
    tmp_list.append(s_tr.values)
and then convert the list into a numpy array:
import numpy as np
tst=np.array(tmp_list,dtype=float)
tst[:,1]
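A self-contained version of the same pattern, with made-up rows standing in for s_tr.values (the data and names here are only for illustration):

```python
import numpy as np

rows = []
for i in range(3):
    # each iteration produces one vector; here a made-up row of three values
    rows.append([i, 2 * i, 3 * i])

# a single conversion at the end, instead of growing an array inside the loop
arr = np.array(rows, dtype=float)

print(arr.shape)
print(arr[:, 1])   # second column of the resulting matrix
```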

Wednesday, September 18, 2013

Super-simple Lookup table from textual file


import sys

# Usage:
# python lut.py argv[1] argv[2]
#   argv[1]: file containing two columns, key and value
#   argv[2]: file containing the keys to retrieve the corresponding values

lut= dict([f.strip().split() for f in open(sys.argv[1],'r').readlines()])
lines=[l.strip().split()  for l in open(sys.argv[2]).readlines()]

for l in lines:
    print lut.get(l[0])
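The same idea can be tried without any files by feeding lists of lines directly. A small sketch; the helper names build_lut and lookup and the sample data are invented for the example:

```python
def build_lut(lines):
    """Build a dict from lines of the form 'key value'."""
    return dict(line.split() for line in lines if line.strip())


def lookup(lut, key_lines):
    """Return the value for the first token of each line (None if missing)."""
    return [lut.get(line.split()[0]) for line in key_lines if line.strip()]


lut = build_lut(["cat 1\n", "dog 2\n", "bird 3\n"])
values = lookup(lut, ["dog\n", "fish\n", "cat\n"])
print(values)
```

Keys missing from the table come back as None, exactly as with lut.get in the script.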

Saturday, August 10, 2013

figures in python notebook


Just remember to start the notebook with this command:

ipython notebook --pylab inline

Thursday, June 6, 2013

Sunday, May 19, 2013

tuples

They are immutable lists: once created, they cannot be changed in place.
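A quick illustration of what immutable means in practice (the values are arbitrary):

```python
point = (1, 2)

# item assignment is rejected for tuples
modified_in_place = True
try:
    point[0] = 10
except TypeError:
    modified_in_place = False

# to "change" a tuple, build a new one (here by going through a list)
as_list = list(point)
as_list[0] = 10
changed = tuple(as_list)

print(modified_in_place)
print(point)
print(changed)
```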

Lists


Lists are ordered collections of items: they are Python's arrays, but more flexible. They are mutable (they can be changed in place).

Good practices:
- always use append
- remember that operations like sort change the list object in place, which means that:

given:
l.sort()
you don't need to copy the result into another list
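A short sketch of the in-place behaviour, next to the built-in sorted() for the case where the original order must be kept (sample values are arbitrary):

```python
l = [3, 1, 2]
result = l.sort()        # sorts l itself; the return value is None

original = [3, 1, 2]
copy_sorted = sorted(original)   # returns a new list, original is untouched

print(l)
print(result)
print(original)
print(copy_sorted)
```

A common slip is writing l = l.sort(), which replaces the list with None.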

# super simple way to deal with  list ( array ) of values


from numpy import arange

tpr1=[]
fpr1=[]
for th in arange(1,10,1):
    tpr1.append(th)
    fpr1.append(th)

but if you need fast lookups by key, use dictionaries:

tpr1={}
c=0
for th in arange(1,10,1):                 
   tpr1[c]=th
   c+=1

.. to pipe in text into a python script..




import sys
data = sys.stdin.readlines()
print "Counted", len(data), "lines."

deal with classification scores obtained with a log loss


# script for sigmoid transform and L1 normalization of classification scores obtained with log loss

from numpy import loadtxt, savetxt, exp, tile, transpose, size, sum
import sys, os

filename_f=sys.argv[1]

# load matrix of scores NxM (N=images, M=categories)
f=loadtxt(filename_f)

# sigmoid to get probabilities
pf=1./(1+exp(-f))

# L1 normalization: divide each row by its sum
pf_n=pf / transpose( tile( sum(pf,1), (size(f,1),1) ) )

# output file: same name with 'Norm' inserted before the extension
filename_pf_n=os.path.splitext(filename_f)[0]+'Norm'+os.path.splitext(filename_f)[1]

savetxt(filename_pf_n,pf_n,fmt='%.6e')


NOTE WELL:

a. import by name all the functions you are using from numpy (it is faster)
b. to split a path from its extension use os.path.splitext
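The normalization step can be checked on a tiny made-up matrix. The sketch below also compares the tile/transpose construction with numpy broadcasting (keepdims=True), which is an alternative and not what the script uses; both should give rows that sum to 1:

```python
import numpy as np

# made-up 2x3 score matrix (2 images, 3 categories)
f = np.array([[0.0, 1.0, -1.0],
              [2.0, 0.0, 0.5]])

# sigmoid to get probabilities
pf = 1.0 / (1.0 + np.exp(-f))

# row sums repeated across columns, as in the script
row_sums_tiled = np.transpose(np.tile(np.sum(pf, 1), (np.size(f, 1), 1)))
pf_n_tile = pf / row_sums_tiled

# same result via broadcasting: keepdims keeps the row sums as a column
pf_n = pf / pf.sum(axis=1, keepdims=True)

print(pf_n.sum(axis=1))
```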

Dictionaries


Dictionaries are unordered collections of items that are stored and fetched by key!

c={}  # dictionary
c=[]  # list

c=dict(enumerate(set(open('/home/lmarches/cvpr13/visual/features_ml/lists_mlfu300/class.txt').read().split())))

set = get unique elements
enumerate = generate the sequence
dict= assembles the dictionary

 433: 'you_need',
 434: 'your_camera',
 435: 'your_entry',
 (key: value)

if you want to invert :

c_inv= dict([(value, key) for (key, value) in c.iteritems()] )

'you_need': 433,
'your_camera': 434,
'your_entry': 435,

if you want to search by value

[key for key,value in c.items() if value=='your_entry' ][0]

435
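For reference, a small sketch of the inversion written with a dict comprehension and items(), which also runs on Python 3 where iteritems() no longer exists. The three entries are copied from the output above; the inversion only makes sense when the values are unique, as they are here since they come from a set:

```python
c = {433: 'you_need', 434: 'your_camera', 435: 'your_entry'}

# swap keys and values
c_inv = {value: key for key, value in c.items()}

# direct lookup in the inverted dictionary
key_direct = c_inv['your_entry']

# linear search by value in the original dictionary
key_search = [key for key, value in c.items() if value == 'your_entry'][0]

print(key_direct)
print(key_search)
```

If many lookups by value are needed, building c_inv once avoids repeating the linear search.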

Strings: indexing, single, double and triple quotes...


For indexing remember that you can index from the start and from the end of a string: 

s='luca'
print s[:2]
print s[:-2]
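A few more slices on the same string, showing indexing from the start and from the end (the variable names are just labels for the example):

```python
s = 'luca'

first_two = s[:2]          # from the start up to (not including) index 2
all_but_last_two = s[:-2]  # everything except the last two characters
last_two = s[-2:]          # the last two characters
last_char = s[-1]          # negative indexes count from the end

print(first_two)
print(all_but_last_two)
print(last_two)
print(last_char)
```

For a four-character string the first two slices happen to coincide: both give lu.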

Triple quotes can be useful when you need to include really long strings (e.g. containing several paragraphs of informational text). It is annoying to have to terminate each line with \n\, especially if you would like to reformat the text occasionally with a powerful text editor like Emacs. For such situations, "triple-quoted" strings can be used, e.g.


        hello = """

            This string is bounded by triple double quotes (3 times ").
        Unescaped newlines in the string are retained, though \
        it is still possible\nto use all normal escape sequences.

            Whitespace at the beginning of a line is
        significant.  If you need to include three opening quotes
        you have to escape at least one of them, e.g. \""".

            This string ends in a newline.
        """


Monday, May 13, 2013

intersect two list of files

useful check to ensure that there's zero overlap between training/ test/ val splits:


set(open('testFree.jpgl')) & set(open('trainFree.jpgl'))


If you put it into a bash script, it should look like this:


#!/bin/bash                                                                                                                                                                                                                                 
echo -e "for i in set(open('$1')) & set(open('$2')):print i" |  python   


credit: @larsmans on Stack Overflow
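A self-contained way to try the check, using two throwaway list files created in a temporary directory (file names and contents are invented for the example):

```python
import os
import tempfile


def overlap(path_a, path_b):
    """Return the set of lines that appear in both files."""
    with open(path_a) as fa:
        with open(path_b) as fb:
            return set(fa) & set(fb)


tmp_dir = tempfile.mkdtemp()
test_list = os.path.join(tmp_dir, 'test.txt')
train_list = os.path.join(tmp_dir, 'train.txt')

with open(test_list, 'w') as fh:
    fh.write('a.jpg\nb.jpg\n')
with open(train_list, 'w') as fh:
    fh.write('b.jpg\nc.jpg\n')

common = overlap(test_list, train_list)
print(common)
```

Lines are compared including their trailing newline, so a last line without a newline would not match the same name followed by one; stripping each line first avoids that.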

Wednesday, April 24, 2013

stats about my blog entries


from scipy import signal                                                                                                                                  
import re                                                                                                                                                  
g=[]                                                                                                                                                      
g11=[]                                                                                                                                                    
for bl in re.findall('\\\\begin{BLOG}.*?\\\\end{BLOG}',open('2012_log.tex').read(),re.DOTALL):g.append(len(bl))                                            
for bl in re.findall('\\\\begin{BLOG}.*?\\\\end{BLOG}',open('../2011/all_test_.tex').read(),re.DOTALL):g11.append(len(bl))                                
g11.reverse()                                                                                                                                              
g11_p=hstack((zeros(15),g11))                                                                                                                              
plot(range(0,len(g11_p)),g11_p,range(0,len(g11_p)),signal.cspline1d(array(g11_p,dtype='float'),10.0))                                                      
plot(range(0,len(g)),g,range(0,len(g)),signal.cspline1d(array(g,dtype='float'),10.0))                                                                      
plot(range(0,len(g11_p)),signal.cspline1d(array(g11_p,dtype='float'),10.0),range(0,len(g)),signal.cspline1d(array(g,dtype='float'),10.0))

Saturday, April 20, 2013

dealing with matrices "a la matlab"


Given a list :

your_list=[]

you play with your list :

your_list.append(3)

Most important thing is to handle it as a numpy array:

your_array_np = np.array(your_array,'dtype=int8')

and then you can do fancy things like indexing it using vectors of indexes, exactly as matlab:


gt=[]                                                                             
                                                                                  
c_idx=1                                                                           
for l in labl:                                                                    
    if (len(re.findall(classtxt[c_idx],l))>0):                                    
        gt.append(1)                                                              
    else:                                                                         
        gt.append(0)                                                              
                     

gta=np.array(gt,dtype='int8')

len(find(gta[find((gta[:]-int8(sgdMat[:,c_idx]>th))==0)]!=0))

Monday, April 15, 2013

check if file exists


try:
   with open('filename'): pass
except IOError:
   print 'Oh dear.'
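When the file is not going to be opened right away, os.path.isfile is another option. A minimal sketch using a temporary file so that it runs anywhere (the paths are generated, not real project files):

```python
import os
import tempfile

# create a real file, plus a path that should not exist
handle, existing_path = tempfile.mkstemp()
os.close(handle)
missing_path = existing_path + '.missing'

exists_existing = os.path.isfile(existing_path)
exists_missing = os.path.isfile(missing_path)

os.remove(existing_path)

print(exists_existing)
print(exists_missing)
```

The try/open approach is still the safer choice when the next step is to read the file, because the file can disappear between the check and the open.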

Matching strings in python

use == (not is, which tests object identity rather than equality) but be careful about

empty spaces => use strip()
newlines => use rstrip() and lstrip()


An even better strategy, from Stack Overflow:

s = s.strip(' \t\n\r')
This will strip any space, \t, \n, or \r characters from the left-hand side, right-hand side, or both sides of the string.


To debug string matching, always print with a visible delimiter around each string:


    print "[DEBUG]: \nnew :"+reservation_new_time+"<"+"\nstd :"+reservation_stored+"<"+" cmp: "+ str(cmp(reservation_new_time,reservation_stored))
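Another way to make stray whitespace visible is to print the repr of each string with %r. A small sketch with made-up values:

```python
stored = 'booking-42'
incoming = 'booking-42 \n'   # trailing space and newline, e.g. read from a file

raw_equal = (stored == incoming)
clean_equal = (stored == incoming.strip())

# %r prints the repr, so the trailing space and \n show up explicitly
debug_line = 'new: %r | stored: %r' % (incoming, stored)

print(raw_equal)
print(clean_equal)
print(debug_line)
```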

Friday, March 22, 2013

sparse matrices from matlab to python


import scipy.io
import scipy.sparse

#Load your matlab matrix in python
testFree=scipy.io.loadmat('TestFreeSparse.mat')

#get the matrix
Test=testFree['TestSparse']

#to dense
TestDense=Test.todense()

# get the indexes of non-zero values
I,J= Test.nonzero()


# get the data
V=Test.data

G = scipy.sparse.csr_matrix((V,(I,J)))
G[0,112342]

Monday, March 11, 2013

The 3 things you absolutely need to know about regexp

1. By default all quantifiers are greedy, which means that they match as much text as they can! If you want to match several separate instances of the same pattern instead, append ? to the quantifier to make it non-greedy:

re.findall('.*?', page)
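The difference is easiest to see on a string that contains the pattern twice. A small sketch with an invented page string and delimiters:

```python
import re

page = '<b>one</b> and <b>two</b>'

# greedy: .* runs to the last possible </b>, so there is a single match
greedy = re.findall('<b>.*</b>', page)

# non-greedy: .*? stops at the first </b>, so each instance is matched
lazy = re.findall('<b>.*?</b>', page)

print(greedy)
print(lazy)
```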

2. By default . does not match newlines, so you can:

either remove the newlines with

re.findall(r'\\begin{itemize}.*?\\end{itemize}', page.replace('\n', ''))

or

re.findall(r'\\begin{itemize}.*?\\end{itemize}', page, re.DOTALL)
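A short sketch of the effect of re.DOTALL on a multi-line block. The page string is invented, and the pattern is written as a raw string with the backslashes doubled so that they match literal backslashes in the text:

```python
import re

page = "\\begin{itemize}\n\\item a\n\\end{itemize}"
pattern = r'\\begin\{itemize\}.*?\\end\{itemize\}'

# without the flag, . stops at the first newline and nothing matches
without_flag = re.findall(pattern, page)

# with re.DOTALL, . also matches newlines and the whole block is found
with_flag = re.findall(pattern, page, re.DOTALL)

print(without_flag)
print(len(with_flag))
```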

3. To get a number (float or integer) you can use ? again, but this time to make a character optional:

re.findall(r'\d+\.?\d*', page)
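A quick check on an invented string shows why the digits after the optional dot should be \d* rather than \d+: with \d+ at least two digits are required, so single-digit integers are skipped.

```python
import re

page = 'width 3 height 4.25 ratio 0.5'

# optional dot followed by zero or more digits: integers and floats
numbers = re.findall(r'\d+\.?\d*', page)

# requiring one or more digits after the optional dot misses the lone 3
two_digits_min = re.findall(r'\d+\.?\d+', page)

as_floats = [float(n) for n in numbers]

print(numbers)
print(two_digits_min)
print(as_floats)
```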