Monday, February 9, 2015

Our dream home, and a monthly mortgage payment calculator in ruby!


We just finished the construction of our dream home:



Now that construction is complete, we are converting the construction loan into a normal mortgage, refinancing to get the best rate. We are weighing all the usual trade-offs, and being the geek that I am, I wanted a deeper look: is paying points worth it, and when would the break-evens be?

So, I decided to write a short little ruby program that I could play around with.

It turns out that the math behind the monthly payment is actually kind of neat.  Have a look at the derivation for a fixed rate mortgage:
http://en.wikipedia.org/wiki/Fixed-rate_mortgage
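That derivation lands on the closed form used in the script below: for a principal P, monthly rate r, and N monthly payments, the fixed payment c is

```latex
c = P \cdot \frac{r(1+r)^N}{(1+r)^N - 1}
```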

With the math already done, it was a simple problem to code.  Here is my work:
# Format a dollar amount with two decimal places
def money(x)
  sprintf("%.2f", x)
end

balance = 500_000.0          # loan principal
annual_rate = 3.25 * 0.01    # 3.25% as a decimal
monthly_rate = annual_rate / 12
n = 15 * 12                  # number of monthly payments

puts "Loan term (number of payments) [#{n}]"
puts "Annual interest rate [#{annual_rate * 100}]"
puts "Monthly interest rate [#{monthly_rate}]"

term = (1 + monthly_rate)**n
puts "Term = [#{term}]"

monthly_payment = balance * (monthly_rate * term / (term - 1))
puts "Monthly Payment = [#{money(monthly_payment)}]"

while balance > 0
  interest_payment = balance * monthly_rate
  # The final payment may be smaller; never pay down more principal than remains.
  principal_payment = [monthly_payment - interest_payment, balance].min
  balance -= principal_payment
  puts "Interest [$#{money(interest_payment)}], Principal [$#{money(principal_payment)}], Balance = [$#{money(balance)}]"
end

All the parameters are at the top of the script. For this example, I used $500K @ 3.25% for 15 years (n = 15*12, where n is the number of monthly payments).

Then you will see the loop, which outputs the interest payment, the principal payment, and the remaining balance. For each month, your interest payment is always the monthly interest rate multiplied by the remaining balance; whatever is left of the monthly payment after interest is what gets applied to the principal.
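To make the first trip through the loop concrete, here are the month-one numbers for the $500K @ 3.25% example, as a small standalone sketch:

```ruby
# Month one of a $500K loan at 3.25% over 15 years.
balance      = 500_000.0
monthly_rate = 0.0325 / 12            # ~0.0027083
n            = 15 * 12

term            = (1 + monthly_rate)**n
monthly_payment = balance * (monthly_rate * term / (term - 1))

interest  = balance * monthly_rate    # $1354.17 of the first payment is interest
principal = monthly_payment - interest # the rest pays down the loan

puts "Payment:   $#{sprintf('%.2f', monthly_payment)}"
puts "Interest:  $#{sprintf('%.2f', interest)}"
puts "Principal: $#{sprintf('%.2f', principal)}"
```

Notice how little of that first payment is principal; the split shifts month by month as the balance shrinks.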

It's kind of neat, and you can run multiple scenarios based on different initial outlays and percentage rates.
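For instance, the points break-even falls out of two runs of the same formula. The sketch below uses hypothetical numbers (1 point costing $5,000 buying the rate down from 3.25% to 3.00%), and the simple cost-divided-by-savings break-even is a first-order approximation that ignores the difference in principal pay-down between the two loans:

```ruby
# Hypothetical: is paying 1 point ($5,000) to go from 3.25% to 3.00%
# worth it on a $500K / 15-year loan, and when do we break even?
def monthly_payment(balance, annual_rate, years)
  r    = annual_rate / 12
  term = (1 + r)**(years * 12)
  balance * (r * term / (term - 1))
end

base    = monthly_payment(500_000.0, 0.0325, 15)
bought  = monthly_payment(500_000.0, 0.0300, 15)
savings = base - bought              # monthly savings from the lower rate

points_cost = 5_000.0                # 1 point on $500K
break_even  = (points_cost / savings).ceil

puts "Monthly savings: $#{sprintf('%.2f', savings)}"
puts "Break-even after #{break_even} payments"
```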

Hopefully someone finds this useful.  If not, no worries, it was my mental yoga for the day.

Thursday, February 5, 2015

Absolute Truths, Perspectives, and Parallelism


We have a product called VerifyRx.  Chances are, when you go to the pharmacy and hand over your prescription, our data is being used to verify that your doctor was eligible to write that prescription. We deliver this functionality as a web service.  And because people will wait at the counter until our web service comes back, you can imagine the kinds of SLAs we are under.  We are live/live across two data centers, servicing millions of transactions each day, with real-time replication across those data centers (courtesy of C* =).  For compliance/auditing purposes, we store the results of those transactions in perpetuity, supporting ad hoc query capabilities, with about 5B records at the user's fingertips (courtesy of ElasticSearch).

And while this was technically challenging, I don't believe it is the hardest problem that we've had to solve over the years.  I say that because each transaction can run in parallel.

f(x, MF) -> y, where all x's are independent.

In the equation, there is only one source of truth, the absolute truth, which we call our Master File (MF).  That master file contains the most accurate and current information available anywhere about a doctor.  And the master file changes, but not based on the transaction.

So, if that is easy, what's hard?

Well, problems that have dependent functions are difficult to parallelize.  What if we want a real-time system to detect fraud?  We want to count the number of prescriptions written by a doctor for controlled substances.  Now we would have something like:

f(x', MF) -> f(x, MF) + 1

People that have toyed with this problem know that the CAP theorem makes distributed counting hard.  To solve this problem, the system must not only service transactions, it must be transactional. (emphasis on the AL)

In fact, mutability in general in a distributed system is difficult.  In the old days of relational databases, you would simply start a transaction (in the "transactionAL" sense of the word), acquire a lock, and happily, safely update the data.  But we know that locking mechanisms don't scale.  (and in general, they can be a royal PITA sometimes -- I spent my week diagnosing an Oracle/Hibernate locking issue in production)

Nowadays, we lean on consensus algorithms and conditional updates.  If two systems attempt to update the same data/count, one will win.  The other will lose and retry.  (See: http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0)
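That lose-and-retry loop looks roughly like this. The `Counter` class here is an in-memory stand-in for a store that supports conditional updates (e.g. Cassandra's IF clause), not a real driver API; the pattern is the same: read, attempt a compare-and-set, retry on conflict.

```ruby
# In-memory stand-in for a store with conditional updates.
class Counter
  def initialize
    @value = 0
    @lock  = Mutex.new
  end

  attr_reader :value

  # Set the value to `new_value` only if it still equals `expected`.
  # Returns true on success, false if another writer won the race.
  def compare_and_set(expected, new_value)
    @lock.synchronize do
      return false unless @value == expected
      @value = new_value
      true
    end
  end
end

def increment(counter)
  loop do
    current = counter.value
    break if counter.compare_and_set(current, current + 1)
    # Lost the race: someone else updated first. Re-read and retry.
  end
end

counter = Counter.new
threads = 10.times.map { Thread.new { 100.times { increment(counter) } } }
threads.each(&:join)
puts counter.value   # => 1000, despite ten concurrent writers
```

No writer ever holds a long-lived lock; losers simply re-read and try again, which is what makes the approach workable at scale.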

Conditional updates enable mutability at scale, but what if you want your customers to be able to define how data is counted/aggregated?  Effectively, you want them to supply the functions and the "master version of truth", and you want them to be able to alter that truth transactionally.  Well, then you would have:

c(x', mf') -> c(x, mf)...

Now you've got a challenge. And now you've got Master Data Management (MDM).

Our platform allows our customers to build highly interactive, customizable universes of data and analytics.

So, it's one thing to handle transactions, and an entirely different thing to handle them transactionally. =)

BTW -- if you are looking for a challenge, let me know.  We are currently looking for a lead developer who wants to come play with us.