I was wondering if there are any ways to simplify the following piece of Code. As you can see, there are numerous dicts being used as well as condition statements to weed out bad input data. Note that the trip rate values are not all inputed yet, the dicts are just copied and pasted for now
EDIT
In any of the rates, (x,y):z . x and y are correct, the z values are not as they're just copy/pasted
this code works in case you want to copy, paste, and test it
import math
# step 1.4 return trip rates
def trip_rates( population_stratification, analysis_type, low_income, medium_income, high_income ):
''' this function returns the proper trip rate tuple to be used based on input
data
ADPT = Average Daily Person Trips per Household
pph = person per household
veh_hh = vehicles per household
(param_1, param_2): ADPT
'''
li = low_income
mi = medium_income
hi = high_income
# table 5 -
if analysis_type == 1:
if population_stratification == 1:
rates = {( li, 1 ):3.6, ( li, 2 ):6.5, ( li, 3 ):9.1, ( li, 4 ):11.5, ( li, 5 ): 13.8,
( mi, 1 ):3.9, ( mi, 2 ):7.3, ( mi, 3 ):10.0, ( mi, 4 ):13.1, ( mi, 5 ): 15.9,
( hi, 1 ):4.5, ( mi, 2 ):9.2, ( mi, 3 ):12.2, ( mi, 4 ):14.8, ( mi, 5 ): 18.2}
return rates
if population_stratification == 2:
rates = {
( li, 1 ):3.1, ( li, 2 ):6.3, ( li, 3 ):9.4, ( li, 4 ):12.5, ( li, 5 ): 14.7,
( mi, 1 ):4.8, ( mi, 2 ):7.2, ( mi, 3 ):10.1, ( mi, 4 ):13.3, ( mi, 5 ): 15.5,
( hi, 1 ):4.9, ( mi, 2 ):7.7, ( mi, 3 ):12.5, ( mi, 4 ):13.8, ( mi, 5 ): 16.7
}
return rates
if population_stratification == 3: #TODO: input actual rate
rates = {
( li, 1 ):3.6, ( li, 2 ):6.5, ( li, 3 ):9.1, ( li, 4 ):11.5, ( li, 5 ): 13.8,
( mi, 1 ):3.9, ( mi, 2 ):7.3, ( mi, 3 ):10.0, ( mi, 4 ):13.1, ( mi, 5 ): 15.9,
( hi, 1 ):4.5, ( mi, 2 ):9.2, ( mi, 3 ):12.2, ( mi, 4 ):14.8, ( mi, 5 ): 18.2
}
return rates
if population_stratification == 4: #TODO: input actual rate
rates = {
( li, 1 ):3.1, ( li, 2 ):6.3, ( li, 3 ):9.4, ( li, 4 ):12.5, ( li, 5 ): 14.7,
( mi, 1 ):4.8, ( mi, 2 ):7.2, ( mi, 3 ):10.1, ( mi, 4 ):13.3, ( mi, 5 ): 15.5,
( hi, 1 ):4.9, ( mi, 2 ):7.7, ( mi, 3 ):12.5, ( mi, 4 ):13.8, ( mi, 5 ): 16.7
}
return rates
#table 6
elif analysis_type == 2:
if population_stratification == 1: #TODO: Change rates
rates = {
( 0, 1 ):3.6, ( 0, 2 ):6.5, ( 0, 3 ):9.1, ( 0, 4 ):11.5, ( 0, 5 ): 13.8,
( 1, 1 ):3.9, ( 1, 2 ):7.3, ( 1, 3 ):10.0, ( 1, 4 ):13.1, ( 1, 5 ): 15.9,
( 2, 1 ):4.5, ( 2, 2 ):9.2, ( 2, 3 ):12.2, ( 2, 4 ):14.8, ( 2, 5 ): 18.2,
( 3, 1 ):4.5, ( 3, 2 ):9.2, ( 3, 3 ):12.2, ( 3, 4 ):14.8, ( 3, 5 ): 18.2
}
return rates
if population_stratification == 2: #TODO: Change rates
rates = {
( 0, 1 ):3.6, ( 0, 2 ):6.5, ( 0, 3 ):9.1, ( 0, 4 ):11.5, ( 0, 5 ): 13.8,
( 1, 1 ):3.9, ( 1, 2 ):7.3, ( 1, 3 ):10.0, ( 1, 4 ):13.1, ( 1, 5 ): 15.9,
( 2, 1 ):4.5, ( 2, 2 ):9.2, ( 2, 3 ):12.2, ( 2, 4 ):14.8, ( 2, 5 ): 18.2,
( 3, 1 ):4.5, ( 3, 2 ):9.2, ( 3, 3 ):12.2, ( 3, 4 ):14.8, ( 3, 5 ): 18.2
}
return rates
if population_stratification == 3: #TODO: Change rates
rates = {
( 0, 1 ):3.6, ( 0, 2 ):6.5, ( 0, 3 ):9.1, ( 0, 4 ):11.5, ( 0, 5 ): 13.8,
( 1, 1 ):3.9, ( 1, 2 ):7.3, ( 1, 3 ):10.0, ( 1, 4 ):13.1, ( 1, 5 ): 15.9,
( 2, 1 ):4.5, ( 2, 2 ):9.2, ( 2, 3 ):12.2, ( 2, 4 ):14.8, ( 2, 5 ): 18.2,
( 3, 1 ):4.5, ( 3, 2 ):9.2, ( 3, 3 ):12.2, ( 3, 4 ):14.8, ( 3, 5 ): 18.2
}
return rates
if population_stratification == 4: #TODO: Change rates
rates = {
( 0, 1 ):3.6, ( 0, 2 ):6.5, ( 0, 3 ):9.1, ( 0, 4 ):11.5, ( 0, 5 ): 13.8,
( 1, 1 ):3.9, ( 1, 2 ):7.3, ( 1, 3 ):10.0, ( 1, 4 ):13.1, ( 1, 5 ): 15.9,
( 2, 1 ):4.5, ( 2, 2 ):9.2, ( 2, 3 ):12.2, ( 2, 4 ):14.8, ( 2, 5 ): 18.2,
( 3, 1 ):4.5, ( 3, 2 ):9.2, ( 3, 3 ):12.2, ( 3, 4 ):14.8, ( 3, 5 ): 18.2
}
return rates
# table 7
elif analysis_type == 3:
if population_stratification == 1: #TODO: input actual rate
rates = {
( li, 0 ):3.6, ( li, 1 ):6.5, ( li, 2 ):9.1, ( li, 3 ):11.5,
( mi, 0 ):3.9, ( mi, 1 ):7.3, ( mi, 2 ):10.0, ( mi, 3 ):13.1,
( hi, 0 ):4.5, ( mi, 1 ):9.2, ( mi, 2 ):12.2, ( mi, 3 ):14.8
}
return rates
if population_stratification == 2: #TODO: input actual rate
rates = {
( li, 0 ):3.6, ( li, 1 ):6.5, ( li, 2 ):9.1, ( li, 3 ):11.5,
( mi, 0 ):3.9, ( mi, 1 ):7.3, ( mi, 2 ):10.0, ( mi, 3 ):13.1,
( hi, 0 ):4.5, ( mi, 1 ):9.2, ( mi, 2 ):12.2, ( mi, 3 ):14.8
}
return rates
if population_stratification == 3: #TODO: input actual rate
rates = {
( li, 0 ):3.6, ( li, 1 ):6.5, ( li, 2 ):9.1, ( li, 3 ):11.5,
( mi, 0 ):3.9, ( mi, 1 ):7.3, ( mi, 2 ):10.0, ( mi, 3 ):13.1,
( hi, 0 ):4.5, ( mi, 1 ):9.2, ( mi, 2 ):12.2, ( mi, 3 ):14.8
}
return rates
if population_stratification == 4: #TODO: input actual rate
rates = {
( li, 0 ):3.6, ( li, 1 ):6.5, ( li, 2 ):9.1, ( li, 3 ):11.5,
( mi, 0 ):3.9, ( mi, 1 ):7.3, ( mi, 2 ):10.0, ( mi, 3 ):13.1,
( hi, 0 ):4.5, ( mi, 1 ):9.2, ( mi, 2 ):12.2, ( mi, 3 ):14.8
}
return rates
def interpolate( population_stratification, analysis_type, low_income, medium_income, high_income, x, y ):
#get rates dict
rates = trip_rates( population_stratification, analysis_type, low_income, medium_income, high_income )
# dealing with x parameters
#when using income levels, x_1 and x_2 are li, mi, or hi
if analysis_type == 1 or analysis_type == 2 or analsis_type == 4:
if x < high_income and x >= medium_income:
x_1 = medium_income
x_2 = high_income
elif x < medium_income:
x_1 = low_income
x_2 = medium_income
else:
x_1 = high_income
x_2 = high_income
if analysis_type == 3:
if x >= 3:
x_1 = 3
x_2 = 3
else:
x_1 = int( math.floor( x ) )
x_2 = int( math.ceil( x ) )
# dealing with y parametrs
#when using persons per household, max number y = 5
if analysis_type == 1 or analysis_type == 4:
if y >= 5:
y_1 = 5
y_2 = 5
else:
y_1 = int( math.floor( y ) )
y_2 = int( math.ceil( y ) )
elif analysis_type == 2 or analysis_type == 3:
if y >= 5:
y_1 = 5
y_2 = 5
else:
y_1 = int( math.floor( y ) )
y_2 = int( math.ceil( y ) )
# denominator
z = ( ( x_2 - x_1 ) * ( y_2 - y_1 ) )
result = ( ( ( rates[( x_1, y_1 )] ) * ( ( x_2 - x ) * ( y_2 - y ) ) / ( z ) ) +
( ( rates[( x_2, y_1 )] ) * ( ( x - x_1 ) * ( y_2 - y ) ) / ( z ) ) +
( ( rates[( x_1, y_2 )] ) * ( ( x_2 - x ) * ( y - y_1 ) ) / ( z ) ) +
( ( rates[( x_2, y_2 )] ) * ( ( x - x_1 ) * ( y - y_1 ) ) / ( z ) ) )
return result
#test
low_income = 20000 #this is calculated using exchange rates
medium_income = 40000 # this is calculated using exchange rates
high_income = 60000 # this is calculated using exchange rates
population_stratification = 1 #inputed by user
analysis_type = 1 #inputed by user
x = 35234.34 #test income
y = 3.5 # test pph
print interpolate( population_stratification, analysis_type, low_income, medium_income, high_income, x, y )
Best Solution
Well, where to start? Here is just a first observation:
You have a lot of data there, and it seems code and data are mixed into each other.
Data and Code should be separate. Data is an external source, something you modify or read in. You could probably adapt your code to quickly parse Data from a good editable representation to a representation useful for your algorithms. I suspect your code will be shorter, clearer, and less error prone (did you notice all of the 'rates' dictionaries have multiple keys, and you miss a lot of 'hi' keys?).
If you need better abstractions such as matrices and arrays of data, look into
numpy
Edit 1
Did you count your number of dimensions? You have a many-dimensional matrix here with X dimensions: analysis_type, population_stratification, income_level, index
If I see right this is a 3x4x3x3 (= 108 entries) "matrix" or "lookup table". If this is the data your model builds on, fine. But can't you put those numbers in a file, or table that you read in? Your code would be next to trivial.
Edit 2
Ok, I'll bite for some minor python style: Testing for values in a Set or a Range.
Instead of:
you can use
or even using readable names as (CUBIC, ..) as suggested.
Instead of:
you can used chained conditions; Python is one of the few programming languages where conditions chain to make nautral if statements:
Edit 3
More important than small code figures is of course code design and refactoring. Edit 2 can only give you some polish.
You should learn to loathe duplicate code.
Also, you have quite a lot of branches in one function. That is a good sign you should break it up into multiple functions. It can also reduce duplication. For example, when one variable like
analysis_type
can totally change what the function does, why have two different behaviors in one function? You shouldn't have the whole program in one function. Perhaps analysis_type == 3 is better expressed in its own function (as an example)?Do you understand that your function
trip_rates
basically does an array lookup, where the array lookup is hardcoded as if ..: return .. if : return .., and the array is written out in full in the function? What iftrip_rates
could be implemented like this? Would it be possible?