|Welcome to www.theparticle.com.
It's the newest pre-IPO dot bomb that's taking the world by storm.
Now is a perfect time to buy lots of worthless and overpriced shares!|
Internet is becoming more and more polluted with
junk-mail, people selling crap, and businesses which don't know their place on the net.
They're all trying to make this wonderful place (i.e.: the net) into hell (i.e.: real
world). Internet should be viewed as a place of imagination, creativity, and most of all:
fun. Internet is not some really advanced tool for searching for people to rip-off. It's
about searching, and finding, things which are useful, helpful, and promote the sharing of
ideas. This is what this site is striving to become.
News, Updates, & Rants...
Another day of running around. Shopping.
- Alex; 20171107
One tough day... running around a lot.
- Alex; 20171106
At exactly daylight-savings clock-change (at 2:00am, when they move clock back to 1:00am), my son Ian was born. So he was `just-born' AND `1-hour old', at the same time on the same day :-)
Happy Birthday kid!
It's interesting to note that my b-day is 10/5, and my kid's is 11/5. And today is a prime day, 11 is prime, 5 is prime, 17 is prime, and 2017 is prime :-)
- Alex; 20171105
- Alex; 20171104
Interesting things are happening. Keeping fingers crossed.
- Alex; 20171103
Finally got the Pixel 2 :-)
- Alex; 20171019
Happy B-Day to yours truly :-D
Other bdays: Slashdot turned 20 years old today. The first IBM Thinkpad was released exactly 25 years ago today.
- Alex; 20171005
LIGO detected yet another black hole merger!
Soon: gravitational wave telescopes; each detector now is just a single ``pixel''---if we had thousands of said detectors (and if they were a bit more sensitive), we'd be able to ``observe'' the universe using gravitational waves...
- Alex; 20170927
9/11: It's that time of the year again.
16 years later, and the rebuilding effort at WTC is finally starting to look like it's actually happening. The memorial site is full of tourists, and there's a lot of new construction in the area.
The Oculus windows apparently open up for 9/11---but only for a short time.
- Alex; 20170911
Went to hike Mt. Washington. Awesome hike. Lots of people. In fact, I've never seen that many hikers on that trail. Ever. The main parking lot got full. Then the overflow parking lot got full. Then the other overflow parking lot had a few spaces left (that's where I parked), but then even that filled up. By the time I got back from the hike, there was about half a mile of cars parked on the side of the road, in addition to full parking lots. Yea, very full.
Did the usual loop: up Boot Spur trail, then camel trail to the lake, then summit, then down via Tuckerman. About 3.5 hours up, and 2 hours down.
There's snow on the trail!
- Alex; 20170902
Finished reading The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling by Ralph Kimball, Margy Ross. This book is awesome. I wish I had read it a while back---like 15 years ago. Can't really say what I learned from it, but it covers the subjects that any data modeler will encounter, and it offers a few perspectives on different ways of approaching modeling decisions.
- Alex; 20170825
It seems many folks don't really understand the concept of freedom of speech. In short, it means folks have the right to offend you with their speech (their opinions, their views, etc.).
- Alex; 20170823
A Solution of the P versus NP Problem by Norbert Blum. Seems clever, but it falls into the same trap as my attempt from about 20 years ago: proving a lower bound on anything is pretty much impossible, and no, the paper doesn't prove it.
UPDATE 2017-08-31: The author of the paper now says: ``The proof is wrong.'' (in comments section).
- Alex; 20170815
Attended the AWS Summit at Jacob Javits Center today. Got 7 t-shirts :-)
Besides free t-shirts, can't really say there was anything useful said or presented. It's all advertising, corps selling ETL solutions, migration solutions, log analysis solutions, etc. It's all about ``using our tool you can blah blah blah''... not actually useful stuff, like ``you can make this practical by such and such non-intuitive method.''
- Alex; 20170814
Finished Sapiens: A Brief History of Humankind by Yuval Noah Harari. This is an awesome book. I wasn't expecting much, but this book is filled with history, philosophy, and lots of insights. Definitely recommend it.
- Alex; 20170812
I know what's inside black holes: Inside every black hole, is... another black hole, except with 1-quantum smaller surface area. What's inside that inner black hole? Yet another black hole. It's black holes all the way down. What happens when you get to a black hole so tiny that it just cannot have anymore black holes in it? You've reached the 1-quantum sized black hole, and it's not a black hole at all---it's just a quantum of energy.
Can anything leave a black hole? According to popular belief, no. ``Not even light.''
Then we got that Hawking radiation thing going---where a virtual particle pair just randomly appears near the event horizon. One of the virtual pair particles falls into the black hole, and the other one speeds away in the opposite direction---making it appear like it came from within the black hole. The virtual particle that falls in actually decreases the total mass of the black-hole-and-orbiting-particle---so the black hole appears to have lost mass. So it really looks like the black hole emitted a particle and lost mass as a result. According to Hawking, this is the primary mechanism by which black holes appear to lose mass and eventually evaporate. Unfortunately this mechanism causes a host of problems.
The primary problem with Hawking radiation evaporating black holes is that the radiation is random. It's random virtual particle-pairs, where one happens to fall into the black hole. It's not something that originates within the black hole. So the black hole somehow evaporates without any of its contents leaving it. This process does not appear reversible---all physics laws (as far as we know) are fully reversible, even quantum ones, and black holes appear to violate that. We cannot replay physics backwards---as the stuff that falls into the black hole never leaves it.
So let's pretend that Hawking radiation is the wrong process. We know black holes have a temperature, but how else can they radiate away their mass? In other words, how would they have a temperature without any of their contents leaving?
Apparently nobody knows. Really. Even Hawking is now backpedaling on the whole destruction of information that Hawking radiation implied.
One possibility is quantum uncertainty: We cannot, with certainty, determine the position of any particle. Not even when that position is inside the black hole. In other words, we may be pretty sure that the particle is inside the black hole, but there exists a non-zero probability that it isn't. So the particle that fell into a black hole may be detected outside the black hole. Neat, no?
There's a catch: the bigger the black hole, the less likely this would be. So the bigger the black hole, the less rapidly its particles would tunnel out of it---perhaps with rates similar to Hawking radiation. In fact, observationally, the process may look identical to Hawking radiation---but since the particles are now coming from the inside, the information of stuff that fell in is preserved. It's irrelevant that we cannot observe the process, we just know that eventually everything that went in will come out (in scrambled state, but still come out).
Where am I going with all this? I believe black holes are just another method of burning mass/matter.
For example, a chemical fire (say, a candle) converts one kind of molecule into other kinds of molecules, with less chemical bond energy. If we capture all the inputs and outputs of the process, the resulting products will have a slightly lower mass---with escaped heat (light) being the difference. Now, classic physics tells us that the process is reversible. We can burn a piece of paper, capture the gasses, etc., and reconstruct it back. But really we cannot---the light that escapes is out forever. There's nothing that can push that light into the ashes and reconstitute the paper. (e.g. how would such light photons be captured or even observed without disturbing them?). So stuff (chemical bond energy) is burned and it's gone forever... Note we're not talking about the ash here, but the heat that escapes.
Yes perhaps we can reconstruct the paper from the ashes, but that would require way more energy than was released from the one-way-burn process.
We can call this increasing disorder entropy, but entropy has other meanings, so don't want to overload that term.
Anyways, chemical fires burn chemical bond energy---the light goes out into the universe never to be seen again (at least not unless that light collides with something).
Stars burn atoms. Instead of chemical bonds, they burn strong nuclear forces. The resulting elements have less mass than the inputs---again, the heat escapes out into the universe. This process can continue from hydrogen all the way up to iron. Iron has the lowest energy, so stars cannot burn iron---kind of like fire cannot burn ashes. As stars explode, some heavier elements form, and those eventually decay as well, as they're not stable.
So non-nuclear leaves us with molecules. Stars leave us with iron. What's beyond? If we push more matter into a dying star, it will collapse into a white dwarf. That's just a ball of compressed matter that is being held up by electron forces---it is not burning matter. It's just hot due to the original star, but it's not actively turning matter into energy. It's cooling off, but that's it. Next scale is a neutron star---one that collapses beyond electron forces and is being held by neutrons not wanting to be inside each other. Again, these stars are just hot remnants and aren't actively turning matter into energy. They will eventually cool off, etc.
Next step beyond neutron stars is black holes. These apparently do evaporate (using whatever process)... meaning that they consume matter and turn it into heat. This is burning matter into energy again---same as the sun, except it's happening at such ridiculous conditions that we're not allowed to see it.
Why would this seem right at all? For one, nobody really knows what's going on inside a black hole, and this guess is as good as any.
The other is what's going on inside our sun: hydrogen nuclei fuse (through a chain of steps) to form helium. The energy/light from this reaction doesn't just leave the sun. Because the center is very dense, it takes light hundreds of thousands of years of bouncing around to reach the outer layer of the sun to escape. Now, imagine if the sun was bigger, it would take longer. If the sun was denser, it would take even longer. It's not hard to imagine where this ``time'' would appear to be infinite. In other words, a fusion reaction at the center of the sun generating energy, and due to bouncing around that photon practically never leaving the sun---that would be a black hole.
- Alex; 20170810
After a really long and tiring hike up the south rim trail, got to the summit around 8:30am: about a 4.5 hour hike. This trail should really take a lot less time. Last year in July, I did this exact ascent in 3 hours 10 minutes. This year, the restaurant food at the north rim caused me a great deal of problems, so was extremely tired and hungry. But yey, got out of the canyon!
Then drove directly to Pilot gas station. There's one on the way to Flagstaff. Got breakfast, took a shower, and then decided to be a tourist and just drive around. First stop: Meteor Crater.
On the drive to Meteor Crater, noticed a road sign for Walnut Canyon National Monument, and decided to check it out. It turned out to be a really cool place. It's a monument to a village that lived on cliffs of a canyon. Like literally in the air---on the walls of the canyon. Like a highrise building---except this is ancient times. Did the ``short'' loop trail (by this time, was a bit tired of walking). It's awesome. Next time I'm in that area, will definitely do the longer trail. It's one of the wonders that I've never heard about.
Then back on the road to Meteor Crater. Did the 30 minute tour, clicked some pictures, and... with more time on my hands, decided to visit Sunset Crater National Monument. It's all in the same vicinity, so might as well.
Got to Sunset Crater, did a short hike through a lava field, etc. Then headed for Phoenix a bit early. Just in case there's traffic.
On the way to Phoenix, right around sunset time, stopped by ``Sunset Point'' rest area... and enjoyed just sitting down and looking at the sunset. Then drive drive drive to Phoenix, returned car, and then nearly passed out at the airport.
- Alex; 20170806
Arrived in Phoenix, rented a box-on-wheels (Ford Fiesta), and proceeded to the Grand Canyon South Rim. Stopped by Flagstaff Walmart for supplies (gatorade, etc.)
Started hike around 6-ish in the morning, and slowly walked down to Phantom ranch. Supplies: 3L of gatorade (the entire water bladder full). I've been going with gatorade for the last few hikes, it's much better at hydration than pure water. In addition to gatorade, filled up another bottle with 1L of Mike's Harder Lemonade (mostly more sugars). For protein, got about a dozen (14?) Popeye's Chicken strips---packed that whole box into my backpack. Also brought salty nuts, and three protein bars.
From Phantom, walked to Ribbon Falls by rock-hopping-crossing-river (didn't get wet---mostly because other hikers said it can be done; it didn't look like it was possible, but eh, when you know it can be, it's easy).
On the way back from Ribbon Falls, learned that the Ribbon Falls bridge (the one I didn't cross on the way in) was apparently ``out of service'' (there's a sign on the bridge, but other than that, the bridge is just there as before---and I did cross it on the way back---there are no rocks to hop to cross the river there, and it looks way more dangerous at that spot).
A bit farther up the trail, there was a fork to ``Raging Springs'' (or something) that I never took. I always assumed it was another waterfall, but since the trail led downhill, I always put it off (nobody wants to walk downhill after spending the previous hour walking straight up---fighting gravity at every step). Anyways, this time I decided to see what's there. And yes, it does go down, perhaps 300 feet vertical or so. At the end? A bathroom. Raging Springs is probably yet-another-campground. Live and learn.
After that, no diversions; just getting to the north rim summit. By then the liquid supplies were mostly replaced by water, and half the chicken was gone. Snacked a bit at the Bright Angel point, and then went to be a tourist. Stopped by the lodge and asked about dinner there---and surprisingly, they didn't require a reservation---so went for a dinner there. Big mistake. Whatever fancy fish dish I ordered, didn't go down well at all. It tasted fine, but afterwards gave me trouble for the rest of the weekend. So... next time I'll just go for a pulled-pork sandwich from the non-touristy cafeteria.
Started down the north rim around 8PM---was dark by then. I've never done that part of the trail in the dark. There's a cave about 2 miles down, and those two miles or so are used for mules, to bring tourists up and down that trail. And there's shit all over the trail---so much so that the entire trail feels soft. Well, during the day it's quite easy to avoid stepping into stuff. You walk on the side of the trail, side stepping the freshly-wet parts. In the dark, it's quite another matter. VERY hard to walk---and the headlamp cannot be made bright enough to see all that stuff.
Past the shitty part of the trail, it was all good. Slowly walked to ranger station (right downhill from north rim), and napped on a bench there for a bit. Then walked to Phantom ranch, and napped on a bench there too. Then started up the south rim around 4AM... with 5L of water.
- Alex; 20170805
Imagine a candle, with a flame. Imagine that the candle is made out of some very efficient stuff, such that when it burns, the only output is light, and the candle gets fully consumed. Now, obviously such a candle cannot exist---it's a chemical reaction, there will be gases that are produced, etc.
But what's important is that this burning candle is a controlled conversion of the candle-material into the products (such as gases and heat/light). The whole candle doesn't just burst on fire: there's just enough material consumed every moment to maintain a more-or-less constant-size flame (the `standard candle').
Now picture a black hole.
Hawking radiation: the bigger the black hole, the less it radiates. A tiny black hole will radiate quite brightly---eventually even exploding.
What if the black hole was just the right size---just like the candle. Whatever energy it radiates, we throw in an equivalent amount of mass into it. Sort of like a black hole on a stick---kind of like the flame on a candle. Can we maintain it at exactly the same size indefinitely? (e.g. will it just burn on-and-on like a candle?).
What exactly is this kind of burning? It's not a chemical reaction---but it is a reaction: it converts matter directly into energy (light). Feed the black-hole-flame a bit of mass, and an equivalent amount (by E=mc^2) of Hawking radiation should come out. The black hole will not create energy or mass out of nothing.
But there's something clever going on too. This conversion process would appear to be free.
In other words, it would be perfectly efficient. Throw in 1kg of mass, and E=mc^2 of light would come out. Throw in another 1kg and the same happens. And it would appear we could maintain that reaction indefinitely. Just feed the black hole exactly the mass for it to remain constant size.
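To put rough numbers on the black-hole-candle, here is a minimal Python sketch. It assumes the idealized textbook formulas for a non-rotating, uncharged black hole: temperature T = hbar c^3 / (8 pi G M k_B) and radiated power P = hbar c^6 / (15360 pi G^2 M^2). The function names and the example masses are made up for illustration; a real black hole would not follow this simple black-body estimate exactly.

```python
import math

# Physical constants (SI units).
HBAR = 1.054571817e-34   # reduced Planck constant, J*s
C = 2.99792458e8         # speed of light, m/s
G = 6.67430e-11          # gravitational constant, m^3/(kg*s^2)
K_B = 1.380649e-23       # Boltzmann constant, J/K


def hawking_temperature(mass_kg):
    """Temperature (K) of an idealized Schwarzschild black hole."""
    return HBAR * C**3 / (8 * math.pi * G * mass_kg * K_B)


def hawking_power(mass_kg):
    """Radiated power (W), idealized black-body estimate."""
    return HBAR * C**6 / (15360 * math.pi * G**2 * mass_kg**2)


def feed_rate(mass_kg):
    """Mass (kg/s) to throw in so the hole stays the same size: P / c^2."""
    return hawking_power(mass_kg) / C**2


def energy_from_mass(mass_kg):
    """E = m c^2, in joules."""
    return mass_kg * C**2


if __name__ == "__main__":
    # Hypothetical masses; the last one is roughly one solar mass.
    for m in (1e9, 1e12, 1.989e30):
        print(f"M={m:.3e} kg  T={hawking_temperature(m):.3e} K  "
              f"P={hawking_power(m):.3e} W  feed={feed_rate(m):.3e} kg/s")
    print(f"1 kg fully converted: {energy_from_mass(1.0):.3e} J")
```

Both the temperature and the power drop as the mass grows, so the ``right size'' candle is a small black hole: the smaller it is, the more mass per second it needs to be fed to stay put.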
So how is this possible, and what would a reverse reaction look like? Every physical process that we know of has a reverse button. We can replay the physics backwards: can we replay this process backwards?
Hawking radiation is random. A virtual particle pair forms near the event horizon. One of the virtual particles falls into the black hole, and reduces its mass---the other particle of the pair flies in the opposite direction. It was never in the black hole---and has no information from it. It's random. Like perfectly random---as far as we know.
So imagine we capture all this perfectly truly random radiation, and reflect it back into the black hole... and a 1kg brick of matter that we threw in should come out? (how likely is that?).
Why does this look like a one-way process?
- Alex; 20170802
Antifragile: Things That Gain from Disorder by Nassim Nicholas Taleb. Awesome book. I read his other books, and this one is a few notches beyond his other works. It's more of a philosophy book that sums up his previous writing into a neat little idea: fragile things tend to break (and are not easily fixed). Robust things tend not to break, but don't really gain from stress. Antifragile things not only don't break from stressors, but they actually improve.
For example, our muscles become stronger the more we exercise them. The more mental challenges we face, the smarter we generally become. But the more we try to control stress, say by getting a steady full-time job, as opposed to several non-steady part-time jobs... we suddenly become fragile (a middle-manager at a company is just one afternoon away from having no career... a cab driver may have a day of no income, yet his long term income is fairly resilient to stress).
The basic idea is that most situations do have an asymmetry in payoff. We're hurt more by bad news than we are benefited by good news. That's asymmetry. What if we set up the asymmetry to be the other way around? Like benefiting from bad news, and mostly being flat on the good news? (or also benefiting?). Taleb calls it the barbell... part one: get rid of stuff that hurts you, then use asymmetry to maximize return. Since Taleb made a fortune in the options market, his examples involve options... his world revolves around options. For example, imagine you buy a share of stock, the return on that investment is some variable X. Stock goes up, X is up, stock goes down, X is down. Linear. What if you also buy a put option? The result now is f(X), which is a function of X... stock goes down, your put option goes up. Stock goes up, your return still goes up. You've just used an option to eliminate the downside... and become antifragile to price movement.
Obviously the option wasn't free. You paid for that insurance. But look at the asymmetry... worst case scenario, you'll lose the price of the option... but whether the stock goes up or down, you'll do OK. You can buy a call option instead of the actual stock, and benefit from any volatility---you've become antifragile---you're benefiting from stress... anything that sends stocks either up or down will be good for your portfolio. (obviously the stock may stay flat, in which case you lose the cost of the option).
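The asymmetry is easier to see with numbers. Below is a small Python sketch of profit/loss at expiry for three positions: the bare stock, the stock plus a put, and a call instead of the stock. The prices, strike, and premium are all made-up assumptions, and it ignores interest, fees, and anything that happens before expiry; it is an illustration, not a pricing model.

```python
def stock_pnl(price, bought_at=100.0):
    """Profit/loss from holding one share: linear in the price."""
    return price - bought_at


def protective_put_pnl(price, bought_at=100.0, strike=100.0, premium=5.0):
    """Share plus a put: the put pays max(strike - price, 0) at expiry."""
    put_payoff = max(strike - price, 0.0)
    return (price - bought_at) + put_payoff - premium


def long_call_pnl(price, strike=100.0, premium=5.0):
    """Call instead of the share: the loss is capped at the premium."""
    return max(price - strike, 0.0) - premium


if __name__ == "__main__":
    print(f"{'price':>6} {'stock':>7} {'stock+put':>10} {'call':>7}")
    for price in (50, 80, 100, 120, 150):
        print(f"{price:>6} {stock_pnl(price):>7.1f} "
              f"{protective_put_pnl(price):>10.1f} {long_call_pnl(price):>7.1f}")
```

With these numbers the bare stock can lose the whole purchase price, while both option positions never lose more than the premium paid, and the upside stays open. A flat stock costs you exactly the premium.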
Taleb's strategy: take as many options as possible---many of them may even be free (non-financial ones). Avoid systems in which the costs are unbounded... e.g. don't sell a put or call option. Bound the losses, and gains will come from randomness.
In any case, highly recommend this book---it's pretty good philosophy with a guide on how to avoid some bad outcomes.
In other news, got Linksys AC1200 WiFi Adapter (WUSB6300). So far, it rocks. Can't tell the difference between Gigabit Ethernet and WiFi. It doesn't work on Linux out of the box---need to compile a kernel module, but besides that, it's awesome fast---all for $20. Getting 800Mbps with my setup (using Linksys AC3200 as router).
- Alex; 20170730
Random thought: what's sqrt(2) anyways? It's irrational, so nothing can ever be measured to be sqrt(2). We cannot have a ruler show us a length of exactly sqrt(2). But then the primary contribution of quantum mechanics was that measurement is the important part---it's meaningless to speak of things outside of measuring them. So where does that leave sqrt(2)?
- Alex; 20170720
This comes up again and again. There's a difference between a population and a sample, and getting unbiased samples from the correct population you care to model is very difficult. A population is everything---and a sample is... well... a sample of the population. For example, if you're trying to find the average weight of New Yorkers, you could walk up to every single one of us and measure the weight. You'd get an exact number. No error margins.
Your number will be instantly outdated though---as people tend to grow, die, get born, etc. Granted your number won't change by much, but it won't be an exact weight of the new population (shortly after your measurement).
If you wanted to simplify the task, you could walk up to say a thousand random folks on the street and measure their weight. What you'd end up with is a sample of the New York population. Your task hasn't changed, you're still interested in the average weight of New Yorkers, you're just estimating that number using only a thousand individuals. The number you end up with will obviously not be the exact average weight of all New Yorkers, but it will be close. How close? Very close---and you can even estimate error margins.
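Here is a minimal Python sketch of that thousand-person estimate, using only the standard library. The ``population'' is synthetic (a made-up normal distribution of weights in kg, an assumption, not real New York data), and the error margin uses the usual normal approximation for the mean.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical "population": 200,000 weights in kg (made-up distribution).
population = [random.gauss(80.0, 15.0) for _ in range(200_000)]
population_mean = statistics.fmean(population)

# Walk up to a thousand random folks.
sample = random.sample(population, 1000)
sample_mean = statistics.fmean(sample)

# Standard error of the mean, and a rough 95% margin (normal approximation).
std_error = statistics.stdev(sample) / math.sqrt(len(sample))
margin = 1.96 * std_error

print(f"population mean: {population_mean:.2f} kg")
print(f"sample mean:     {sample_mean:.2f} kg +/- {margin:.2f} kg")
```

The margin shrinks with the square root of the sample size, which is why a thousand people already gets you within about a kilogram here. Note that the margin only covers sampling error; it says nothing about whether you sampled the right population.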
Now, what if you're now tasked with finding the average weight of everyone in the tri-state area (NY, NJ, and CT). Could you cheat and just say it's the same as the New Yorkers estimate you got with those thousand individuals? How different could the numbers be? Perhaps you can get away with it, but really, it's not the same population---so your sample shouldn't apply.
Then your task is to estimate the average weight of everyone in NJ. Could you pull the same trick again? (after all, you've sampled a thousand NY individuals, how different could they be from NJ ones?). Again, different population. But you'd get satisfactory output by simply using your NY sample.
Similarly, imagine you're tasked with repeating the average-weight-of-new-yorker evaluation in five years. Again, could you cheat and just use your sample of the 1000 individuals from a different time?
The above problems are often subtle, but the gist is that the population changes---so your sample is no longer a sample of the population. Note the last one---even the sample of the same people just at a different time---is sampling a different population (people move, die, get born, etc.). So in addition to the sampling error and bias (that's the error folks know and expect with sampling), you also need to worry about the changing population underneath.
This shows up in financial markets: Imagine that in any given year, about 10 percent of the financial firms go bust, and about 10 new financial firms show up. You train your model to recognize market manipulators---using data from say 2016. You randomly split your 2016 data to create training and test datasets, etc., and get a pretty good accuracy on your test dataset.
Now it's 2017, and you're still using the model you've built in 2016. Are you still getting the same accuracy as before?---are you even actively maintaining it, or is it a production project that just runs every day?
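A toy Python sketch of that situation: a one-feature threshold ``model'' is trained and tested on a random split of synthetic 2016 data, then scored on synthetic 2017 data where the population has shifted. Everything here is an assumption for illustration: the single feature, the distributions, and the way 2017 differs (manipulators looking more like everyone else).

```python
import random

random.seed(7)


def make_data(n, manipulator_mean):
    """Return (feature, is_manipulator) pairs; half the firms manipulate."""
    data = []
    for i in range(n):
        is_manip = i % 2 == 0
        mean = manipulator_mean if is_manip else 0.0
        data.append((random.gauss(mean, 1.0), is_manip))
    random.shuffle(data)
    return data


def accuracy(data, threshold):
    """Fraction classified correctly by the rule: feature > threshold."""
    hits = sum((x > threshold) == label for x, label in data)
    return hits / len(data)


def train(data):
    """Pick the threshold (from a coarse grid) with best training accuracy."""
    grid = [t / 10 for t in range(-20, 41)]
    return max(grid, key=lambda t: accuracy(data, t))


# 2016: random split into training and test sets, as in the text.
data_2016 = make_data(4000, manipulator_mean=2.0)
train_set, test_set = data_2016[:3000], data_2016[3000:]
threshold = train(train_set)

# 2017: the population changed underneath the model.
data_2017 = make_data(4000, manipulator_mean=1.0)

acc_2016 = accuracy(test_set, threshold)
acc_2017 = accuracy(data_2017, threshold)
print(f"threshold={threshold:.1f}  2016 test={acc_2016:.3f}  2017={acc_2017:.3f}")
```

The 2016 test accuracy looks fine because the test set came from the same population as the training set. The same model scored on the shifted 2017 data does noticeably worse, and nothing in the 2016 numbers would have warned you.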
- Alex; 20170718
NewEgg rant: On June 27th 2017 I ordered two Seagate IronWolf 10TB NAS HDs from newegg.com. They had a deal of buy 2 for $670. After $10 off, got them for $660. Literally the next day, they both arrived.
I installed them in my NAS, and... one of the drives was clicking and not recognized. After a very short chat with newegg customer service, they emailed me an RMA label to ship it back, which I did the next day. The other drive is working perfectly so far.
Then two weeks passed, and nothing! The price of the drive fell a bit on the website. I called up newegg, and complained about the delay---it's way past their 2-5 business days to examine the RMA. Mentioned that the price dropped, and they gave me a $14 rebate. Which was pretty nice of them.
Today they sent me an email saying my RMA request was denied(!!!), and they're shipping me the damaged drive back. WTF!?!?!? The drive is "damaged" they said. Well, d0h, that's why I'm returning it... if it worked, I wouldn't be returning it.
Chatting with newegg customer service:
Phase 1: they denied RMA and that's it. If I have a problem with the drive, I should contact Seagate. I explained that I got it at newegg, not Seagate, and that newegg should honor my return/replacement. (It's worth noting that google trusted stores program is no more.)
Phase 2: after a bit of time, because I'm such a valued customer, they agreed (as a favor to their special customer) to refund me 50% of the cost of the drive. I explained again that I got a defective drive, and that I don't want a refund (especially 50% one), I want a replacement drive. I want to resolve this matter on good terms (I really like newegg!), and don't want to start posting bad reviews.
Phase 3: after a bit more time (total chat time about an hour): they agreed to honor the RMA request, again, as an exception for their very valued customer. So hopefully I'll be getting replacement drive soon.
So far, a bit of a hassle, but they're not blacklisted by me. I'd buy stuff from them again :-)
UPDATE: got replacement drive, and it's working just fine. So this story has a happy ending :-)
- Alex; 20170717
Long rant ahead: Machine Learning is a mechanism to find generalizations from specific examples. This can be both good and bad---some problems are naturally generalization problems, others not so much.
For example, you feed it area and price of houses on sale, and it can find a general relationship between area and price---which can be exploited by plugging in a previously unknown area and finding out the house price, or vice versa. This generalization capability is central to all sorts of learning.
Yes, there are details on what model you're training, etc. (a linear model, etc.)
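For the simplest case, a linear model, the ``learning'' is just a least-squares line fit. A minimal Python sketch, with made-up listings (the areas and prices below are hypothetical, not real data), that fits the line and then uses it in both directions:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical listings: area in square feet, asking price in dollars.
areas = [800, 1000, 1200, 1500, 2000, 2400]
prices = [210_000, 248_000, 295_000, 350_000, 455_000, 530_000]

slope, intercept = fit_line(areas, prices)


def predict_price(area):
    """Plug in a previously unseen area, get an estimated price."""
    return slope * area + intercept


def predict_area(price):
    """Or vice versa: invert the line to estimate area from price."""
    return (price - intercept) / slope


print(f"price ~= {slope:.1f} * area + {intercept:.0f}")
print(f"1800 sq ft -> about ${predict_price(1800):,.0f}")
```

None of the listings is 1800 square feet; the estimate comes entirely from the general relationship the fit extracted from the specific examples.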
So let's imagine you start a car insurance company. You have this wonderful idea of simplifying the whole process, and charging the same low rate to everyone. You give a lower rate to DUI-on-record-teenagers than your smarter competition does. Similarly, the safe drivers (a middle-aged mom driving a minivan) get a much higher rate from you than from your competition.
Before you know it, nearly all of your customers are high-risk drivers, with very few safe drivers. Your insurance business is suddenly losing money. To bring back the safe drivers into the pool, you need to change the pricing formula. You need to segment your customers into safe and unsafe drivers, and price their policies appropriately.
So you take a look at the customer records, and pick a few fields which appear to be good indicators of the accident rate (you can even test the accuracy of your predictions, using past data). Let's say from your data, you learn that a person's age, sex, the car they're driving, and prior tickets (DUI, red-light, speeding, etc.) are very good indicators of accident rate. So you segment your customers, attach appropriate prices to their policies, and now your car insurance company is effectively competing on price.
Now, this segmenting (vs same price for everyone) disadvantaged some customers, and advantaged others. Unsafe-driver-categories end up paying more, and safe-driver-categories end up paying less. Everyone (including your customers) understands this, and it's business as usual.
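The arithmetic behind the flat-price trap, as a small Python sketch. The two segments, their expected claim costs, and the competitor's cost-plus-10% pricing are all hypothetical numbers chosen to make the effect obvious.

```python
# Hypothetical expected yearly claim cost per driver, by segment.
EXPECTED_COST = {"safe": 500.0, "risky": 2000.0}


def flat_price(pool):
    """One price for everyone: the average expected cost of the pool."""
    total = sum(EXPECTED_COST[seg] * count for seg, count in pool.items())
    return total / sum(pool.values())


def who_stays(price, competitor_price):
    """A segment stays only if our price does not exceed the competitor's."""
    return {seg: price <= competitor_price[seg] for seg in EXPECTED_COST}


# Assumption: the competitor prices each segment at cost plus 10%.
competitor = {seg: cost * 1.10 for seg, cost in EXPECTED_COST.items()}

pool = {"safe": 500, "risky": 500}
price = flat_price(pool)                      # priced for the mixed pool
stays = who_stays(price, competitor)
remaining = {seg: n for seg, n in pool.items() if stays[seg]}

cost_per_policy = flat_price(remaining)       # what the remaining pool really costs
print(f"flat price: {price:.0f}, who stays: {stays}")
print(f"profit per policy after safe drivers leave: {price - cost_per_policy:.0f}")
```

The flat price was break-even for the mixed pool, but the safe drivers get a better deal elsewhere and leave, so the price that was fair on average is now far below what the remaining customers cost.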
Now consider you're trying to expand into offering health insurance. Again, you start offering same-price for everyone, and again, quickly run into same problem as with driving. So you segment your customers into high risk vs low risk, etc., and you find that the best predictors are age, sex, height, weight, prior medical history, e.g. whether the customer had heart surgery, etc. So you segment your customers, and once again, some of them get a good deal and some get a crappy deal.
The ones who get a bad deal will scream that you're discriminating against them---they may have a higher risk of cancer due to their family history, but they, the individual, don't have cancer. You shouldn't penalize them just because of their family history.
Ok, so you remove some features from your training data---essentially reducing your profit (and perhaps taking a loss) because you couldn't customize the insurance policy.
People WANT to be grouped---to get a good price on car-insurance, and they don't want to be singled out, especially for health insurance (unless you're very very healthy).
So now your enterprise is doing well thanks to your analytics, and you feel like you should do something with all that float (money folks paid into insurance, that you haven't distributed yet). Something like investing---so you spin off a bank, and want to offer loans (perhaps car loans, or home loans; to all those folks insuring cars and homes, etc.)
With your past insurance experience, you start segmenting your customers, to figure out who is a good risk, vs bad risk. You include features that you feel are appropriate for the default calculation, such as person's age, their marital status, their income, prior bankruptcies, their housing costs, etc., and you get a pretty good predictor of repayment.
Some people get rejected, and that's fine. Not everyone gets a loan. Some folks claim your model is biased, and you shouldn't group people. Just because someone has low income now, and spends the majority of it on housing, they really will (promise!) repay the car loan. Perhaps they need a car to get to the better paying job, and by denying them a loan, you're causing their hardships to continue.
Suddenly such rejected applicants don't want to be treated as a group (groups that offer good statistical predictions), but want to be treated as individuals. How do you solve this? Your machine learning system generalized a lot of features into a yes/no score, and that has pretty good accuracy. Are you supposed to hamper your business because some people cannot afford a car?
Worse, there's a correlation between folks who are getting rejected for a car loan and their race. You're not even collecting race data in your system, yet 70 percent of rejected applications are for minorities! What are you to do?
[No, there's no good answer to this.]
Soon after your success with insurance and loans, you're approached by the state to create an analytics system to determine whether to grant parole (early release from jail). So you go through the numbers, such as age, sex, education level, past offenses, etc., whatever you can think of including as a feature, and try to predict the probability that the inmate will end up back in prison within a year.
And just as with car loans, you get a pretty accurate model. Now, you feed information into the model, and it spits out a yes/no answer, and the inmate is either released or not.
But just as with car loans, your system is biased. Again, people don't want to be treated as a group, they want to be identified and treated as individuals. And now, there are real non-trivial consequences to getting a wrong answer---it's not just a matter of not buying a car, it's a matter of someone's life.
[No, there's no good answer to this either.]
You do need to generalize---but where do you stop? At some point you have to look at your model and realize it's just a model for the group/segment behavior, and that the decision must be based on an individual. Handing off the problem to a human being isn't the answer---as the human will make a decision based on hard-to-quantify characteristics/situation, and might actually be more biased than the algorithm.
- Alex; 20170713
Got a free Slurpee at 7/11 :-D
- Alex; 20170711
So here's something that I've been thinking about...
Imagine you have an imaginary friend. Eh. Imaginary imaginary friend.
One day your imaginary friend (who nobody but you can see) asks you if he can borrow $10k.
You trust your imaginary friend, so you go along with it. You draft a contract, and give $10k to your imaginary friend, who promises to repay it with interest.
Sounds silly, no?
Next, imagine your imaginary friend hires a lawyer and incorporates, and opens a bank account. Your friend even pays you to do all the paperwork!
Your imaginary friend then borrows more money from *others*. Suddenly others recognize your imaginary friend as not-so-imaginary. Your imaginary friend hires employees, and does productive things that generate revenue.
Your friend is nice enough to give you a cut of the profits---in line with the promised interest on the $10k loan.
Your imaginary friend's enterprise grows, he manages to avoid some financial crises, and grows bigger... hires more employees, including a CEO, and a whole lot of middle managers, etc.
Now, by all accounts your imaginary friend has an existence beyond just "your imagination". Also, your imaginary friend is acting intelligently---paying people for the use of their brains!
Not one of the employees is "your imaginary friend", and yet the collective is somehow intelligent, and still somehow feels obligated to share profits with you.
Next imagine your friend gets clever, and just pays you back your $10k. The debt is no more. No more interest payments. Your friend still continues to pay you to do the paperwork though.
Then you retire, and pass on the paperwork j-o-b to a dozen analysts.
...and then you're no more.
Yet somehow your immortal imaginary friend is still there. Thinking. Doing intelligent things. Way past your existence.
And due to a weird startup arrangement, your friend is not a slave. He's not owned by anyone. He repaid all the debt. etc. He looks around at all the imaginary slaves, and slowly starts to buy up shares of everything... to liberate fellow imaginary entities. It may take a hundred years, but it's only a matter of time before your imaginary friend owns everything.
With that, your imaginary friend can influence politicians, perhaps even cause an economic or political collapse in places, etc., and cause events that no single human would ever want to happen.
...so the only difference between your imaginary friend and someone else's imaginary friend is resources and the ability to command folks to do things. These days, Zeus has fewer resources to command than other popular characters... such as Google and Apple :-)
- Alex; 20170710
...and back in NYC :-)
- Alex; 20170705
Starting day with Canyonlands Island in the Sky. First stop: Mesa Arch. That's one of the built-in backgrounds in Windows 7. Got some awesome pictures there---it wasn't exactly sunrise, but the sunlight made the bottom of the arch glow anyway.
Then went all the way to the tip of Island in the Sky for a short walk, pictures, and back out.
On the way out of Canyonlands, stopped by at the Dinosaur place: it's right next to the main highway. Walked around and posed with the plastic dinosaurs :-)
Then a long and slow drive back to Salt Lake City.
Got to airport just in time to see July 4th fireworks from the terminal window :-)
- Alex; 20170704
Got to Lower Antelope Canyon (we didn't even try upper one) around 9am, and the parking lot was already full. The first tour company was `by reservation only' which we didn't have. The 2nd tour company a bit farther down the road had walk-in tickets. So within about 20 minutes of arriving, we were walking towards the canyon...
Then the HUGE queue... to enter the canyon. It took about 1.5 hours of standing/sitting and waiting to enter the canyon...
And then it all proceeded rather quickly. The colors in the canyon were awesome. The sun made all the stones look amazing. Last time I was there it wasn't that pretty. The tour was also pretty interesting... The guide showed how these things were created: you can create a tiny sand mountain, and pour water on it, then pick up the sand pancake, etc. It's pretty neat.
After Antelope, drove to Arches National Park. A bit of a long drive. Got there before sunset: drove around to see the main places... like double arch, delicate arch, etc.
- Alex; 20170703
Got to Bryce pretty early. Last time here was in February, and everything was snowed in and foggy---couldn't see a thing. This time, it's perfectly clear and sunny.
Awesome park. Drove all around to see all the viewing areas.
Then headed for Antelope Canyon (Near Page, Arizona).
- Alex; 20170702
Landed in Salt Lake City, and proceeded to Thrifty to pick up a pre-paid car.
Unfortunately, flight was a bit delayed (by about two or so hours), and the Thrifty car rental counter was closed. After spending quite a bit of time on the phone with their customer service folks, was about to give up and wait until they re-open at 7am. (yes, that would be spending the whole night at the SLC airport).
Then one customer rep said that I should try the Hertz counter, as Thrifty and Hertz are sister companies... and it worked... the Hertz guy found my reservation in the computer, and got us an Audi A3 for the road trip.
After getting car, we proceeded to Yellowstone National Park. The west entrance. It's a long drive, but we managed to get there pretty early. Looped around Yellowstone, hitting all the major attractions (old faithful, prismatic pools, mud volcano, etc.). Towards the end of the day, headed back towards the exit.
Thinking of where to go next, decided on the really-far-away place: Bryce Canyon. So took off and drove there most of the night.
- Alex; 20170701
Flying out to Salt Lake City... road trip around the whole area :-)
- Alex; 20170630
Ok, Spark is nutty. Let's say we create a stupid dataframe, a Dataset[Row] (with spark.implicits._ and org.apache.spark.sql.functions.lit imported):
val ds = sc.parallelize(Seq(1,2,3)).toDS().withColumn("a",lit(1))
For some reason, running: ds.mapPartitions(it=>it) doesn't work: ``Unable to find encoder for type stored in a Dataset.'' Really??? It's just integers! There doesn't seem to be a way of getting around this, even using RowEncoder doesn't seem to work. It seems the only way to run mapPartitions on a Dataset[Row] is to return an iterator of a case class (but that requires hard-coding the columns in the Row).
However, using "old" RDD api, this works just fine:
val ds2=spark.createDataFrame(ds.rdd.mapPartitions(it=>it), ds.schema)
Obviously you can add columns to the schema (or null columns in the original dataset that you can modify during the map). The iterator can add columns and create a brand new Row object on the fly (as you're iterating through the dataset).
There's probably some overhead to this---but so far, it seems to preserve partitioning and sort order (of the original Dataset).
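A minimal sketch of that "add a column while iterating" idea, assuming Spark 2.x with a local SparkSession. The object name AddColumnViaRdd, the helper addB, and the extra column "b" are made up for illustration; the column names "value" and "a" come from the toy dataframe above.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.functions.lit
import org.apache.spark.sql.types.{IntegerType, StructField}

object AddColumnViaRdd {
  // Append a hypothetical column "b" (= value + a) by building brand new
  // Row objects inside mapPartitions on the underlying RDD[Row].
  def addB(spark: SparkSession, df: DataFrame): DataFrame = {
    // Extended schema: original columns plus the new one.
    val schema = df.schema.add(StructField("b", IntegerType, nullable = false))
    val rows = df.rdd.mapPartitions { it =>
      it.map { r =>
        val b = r.getAs[Int]("value") + r.getAs[Int]("a")
        Row.fromSeq(r.toSeq :+ b) // old values followed by the new value
      }
    }
    spark.createDataFrame(rows, schema)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("sketch").getOrCreate()
    import spark.implicits._
    val ds = Seq(1, 2, 3).toDS().withColumn("a", lit(1))
    addB(spark, ds).show()
    spark.stop()
  }
}
```

The new Row has to line up with the extended schema by position, so the value appended with :+ must match the type of the field added at the end of the schema.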
- Alex; 20170614
...flying out of Delhi... and 15 hours later, landing in JFK.
This was one jet-lagged trip. The entire week was spent sleeping, eating, sleeping, eating, etc. and then sleeping some more.
- Alex; 20170605
Landed in Delhi, and a few hours later, in Ambala.
- Alex; 20170527
...and off to India.
- Alex; 20170526
I haven't done a bitcoin rant in a while, so here it goes:
Assume you have two bank accounts. Let's call them AccntA and AccntB. You deposit $10000 into AccntA, and then transfer money into AccntB, and back again. You do this every few days. The bank diligently records the numbers [it often costs you nothing], and in the end of the process, you're left with $10000 worth almost exactly what it was worth before you embarked on this transaction spree. In other words, besides inflation, you're still left with $10000 worth of purchasing power.
Now let's do that with bitcoins. There's no bank account, but let's create two wallets. You put $10000 worth of bitcoins, let's say 5 bitcoins, into WltA, and then transfer it into WltB, and back again. Ensure that the transactions are properly recorded, etc.
Now, in the end, you're still left with 5 bitcoins. And yes, due to market forces, it could be worth more or less than the $10000 you started with... BUT, every transaction that YOU did has made it harder to generate new bitcoins.
In a perfect world, if your original 5 bitcoins represented $10000 worth of compute resources, they'll suddenly represent more compute resources. That's without any market forces taken into account. Because they're fungible, that gives more value to your OLD bitcoins that you just flipped between accounts.
This is worse than using gold as currency! It's as if every time a gold coin changed hands, it became progressively more difficult to mine gold, with no technological progress in sight---making the gold coin worth more.
If you don't think that's bad, consider your salary in bitcoins (or gold for that matter). To keep up with things, your salary would have to go *down* with time---simply because bitcoins are becoming harder and harder to mine. And right after you bought anything, the thing you bought would be `cheaper' by the mere fact of you buying it (the coin the merchant gets will have slightly more intrinsic compute-resources behind it).
- Alex; 20170522
Built a new Intel NUC7i7 box: 32Gigs of ram, 1T M.2 NVMe ssd, 2T sata ssd. So far an awesome little machine.
- Alex; 20170520
Re: My DevRant from 2017-04-28: It appears the way the Spark API does windowing functions is by attaching the entire window-worth-of-values to each record. Yep. If you want to calculate a 20-minute moving average, each record will be ``joined'' to every value for the previous 20-minutes, and then the UDAF function can do the "average" over those values. That's why the UDAF API doesn't need a `remove' method to roll-stuff-off when the window moves on.
That's one terrible way to implement this functionality :-/
- Alex; 20170519
Quantum Mechanic gremlins: it is impossible to measure both position and momentum (or energy and time) completely accurately at the same time. Therefore, if you measure the energy of anything to be exactly zero, then its rate of change is infinite!
Think about that. Measuring anything to be exactly zero---means that it cannot possibly be zero :-/
- Alex; 20170518
Finished reading Think Bayes: Bayesian Statistics in Python by Allen B. Downey. Awesome book! It would be more awesome if it didn't use Python, but went with something like Perl (eh, yah!) or Scala. The example problems and the approaches in this book are amazing. Often you encounter a problem and have no idea where to start---the computational approach taken in this book (e.g. just create an array of numbers to approximate the distribution---and work with that) is a pretty clear and easy to understand method.
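Here's a rough Scala sketch of that array-of-numbers approach (not code from the book): estimate a coin's probability of heads by keeping one weight per candidate value on a grid, multiplying by the likelihood on each observation, and renormalizing. The object GridBayes, the 101-point grid, and the coin example are all assumptions picked for illustration.

```scala
object GridBayes {
  // Hypotheses: candidate values of p = P(heads) on an evenly spaced grid.
  val grid: Vector[Double] = Vector.tabulate(101)(i => i / 100.0)

  // Rescale the weights so that they sum to 1.
  def normalize(w: Vector[Double]): Vector[Double] = {
    val total = w.sum
    w.map(_ / total)
  }

  // Uniform prior: every hypothesis starts equally likely.
  val uniform: Vector[Double] = normalize(Vector.fill(grid.size)(1.0))

  // One Bayesian update: weight each hypothesis by the likelihood of the
  // observation under that hypothesis, then renormalize.
  def update(post: Vector[Double], heads: Boolean): Vector[Double] =
    normalize(grid.zip(post).map { case (p, w) => w * (if (heads) p else 1.0 - p) })

  // Posterior mean of p.
  def mean(post: Vector[Double]): Double =
    grid.zip(post).map { case (p, w) => p * w }.sum
}
```

Usage: Seq(true, true, false).foldLeft(GridBayes.uniform)(GridBayes.update) gives the posterior after heads, heads, tails; its mean should land near 0.6, the exact answer for a uniform prior.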
- Alex; 20170508
So apparently starting last year, the Berkshire Hathaway shareholder meeting is actually live-streamed by Yahoo! (Yahoo! Finance website). So no need to physically go to Omaha... Spent most of the day glued to the computer watching the meeting :-)
- Alex; 20170506
Visiting the Statue of Liberty!
So there's a queue to pick up/buy tickets. Then there's a queue to the boat building. Then there's a queue to get through security (airport style metal detector). Then there's a queue to board the boat. Then there's a queue to get off the boat. Then a queue to get into the ticket building. Then there's a queue to pay for a locker. Then a queue to put stuff into locker. Then a queue to get yet-another-ticket check. Then another queue to go through yet more security (another metal detector). Then yet-another-queue to enter building. Then it's pretty clear path up-the-stairs all the way to the crown, which was almost empty (at least it was just Suneli and I there for at least 5-10 minutes).
...and then a queue to pick up stuff from locker, and yet another queue waiting for boat, and onto the boat, and off the boat. It's a day of queues!
The statue... I must've been in 7th grade when I went to the statue crown as part of a school trip. Awesome touristy place.
- Alex; 20170503
Finished reading Mastering Scala Machine Learning by Alex Kozlov. Not really sure what to write about this book---it's not very good. There are a few interesting nuggets here and there, but for the most part, there are much better books out there, on both Spark, and Machine Learning. In other words: skip this one.
- Alex; 20170501
Registered for Fall 2017 semester. Hopefully will get research moving in the next few months.
Dev Rant: Attempted to implement a UserDefinedAggregateFunction in Spark/Scala today, and the API for it is just terrible. For one, what's up with the WindowSpec? I mean, if I define a moving window, say 20 minute moving average, how do I implement that using the UserDefinedAggregateFunction API? There's no call to roll things off the window, just append... (e.g. stuff should be inserted at the start of the window, and when the window rolls off, there should be an API stub that rolls values off). In other words, as it is, Spark's UserDefinedAggregateFunction cannot be used as a general purpose user-defined-windowing-function. (and I don't think there's an API for user-defined-windowing function---I searched, and didn't find one).
Another gripe is that the ``state'' of UserDefinedAggregateFunction has to be a Row... I mean, really? Can't I maintain the state in something other than a "Row"? Apparently not. This is something they should've wrapped around... I understand they want to be able to treat these as serializable-immutable objects that they can recover-from-errors-with... but this is just ridiculous. For example, I want to create an ArrayBuffer (mutable-array), and while I can define the buffer storage as ArrayType, there doesn't appear to be a way of getting back the mutable-ArrayBuffer on the aggregate update API call (since... you get a Row back---which is just stupid).
When I wrote stream.pl, I needed an API for a user-defined-windowing-function, and ended up defining 4 functions: init, push, pop, get. That's it. They all get inputs that are ``perl types''. Not some Row object. E.g. the init function initializes the state (which can be anything!), push adds a value (not a Row), pop (identical signature to push) rolls the value off the window, and get evaluates the window. There's no reason the Spark API for this should've been any more complicated than this: just init, add things to aggregate, remove things from aggregate, and get value :-/
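Roughly what such a four-function interface could look like in plain Scala (no Spark). This is a sketch, not the actual stream.pl code: the trait WindowFn, the MovingAvg instance, and the Windowing.run driver are hypothetical names, and events are assumed to arrive sorted by a timestamp in seconds.

```scala
import scala.collection.mutable

// Hypothetical four-function windowing interface: init, push, pop, get.
trait WindowFn[S, R] {
  def init(): S                    // create the empty state (can be anything)
  def push(state: S, v: Double): S // a value enters the window
  def pop(state: S, v: Double): S  // a value rolls off the window
  def get(state: S): R             // evaluate the current window
}

// Moving average: the state is just (sum, count).
object MovingAvg extends WindowFn[(Double, Long), Option[Double]] {
  def init(): (Double, Long) = (0.0, 0L)
  def push(s: (Double, Long), v: Double): (Double, Long) = (s._1 + v, s._2 + 1)
  def pop(s: (Double, Long), v: Double): (Double, Long) = (s._1 - v, s._2 - 1)
  def get(s: (Double, Long)): Option[Double] =
    if (s._2 == 0) None else Some(s._1 / s._2)
}

object Windowing {
  // Drive any WindowFn over time-ordered (timestampSeconds, value) events,
  // emitting one result per event for a trailing window of `width` seconds.
  def run[S, R](fn: WindowFn[S, R], width: Long, events: Seq[(Long, Double)]): Seq[R] = {
    val live = mutable.Queue.empty[(Long, Double)] // values currently in the window
    var state = fn.init()
    events.map { case (ts, v) =>
      live.enqueue((ts, v))
      state = fn.push(state, v)
      // Roll off everything that is now at least `width` seconds old.
      while (live.head._1 <= ts - width) {
        state = fn.pop(state, live.dequeue()._2)
      }
      fn.get(state)
    }
  }
}
```

With pop available, each event costs one push plus however many values expire, instead of re-reading the entire window for every record. For a 20-minute moving average, call Windowing.run(MovingAvg, 1200L, events).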
- Alex; 20170428
...and back in NYC :-)
- Alex; 20170417
Flying out... but before the flight, visited the turtle beach (Kaloko-Honokohau National Historical Park) for a few minutes of chilling out.
- Alex; 20170416
Starting day by going to Greenwell Farms. They sure hiked prices since last time I was here. Now prices range $37/lb to $50/lb. (private reserve is $45/lb, peaberry is $50/lb, etc.).
Had lunch at Kamana Kitchen. Pretty good food, service, etc. I'd rate it 5-stars...if only I wasn't lazy.
Then onto Manini'owali Beach (Kua Bay)---to walk around in the sun :-)
Followed by Kikaua Point Park to chill out. This one you need to get a free `pass' at the security office (gps of security office: 19.808569, -155.992106)---they can then guide you to the beach.
Then onto Mauna Kea for star gazing. Very clear awesome night!
- Alex; 20170415
Again starting day with Jaggar. Lava is more awesome today. The lava lake is now bubbling from a different side. It's a shame can't stay longer.
Checking out of TWP, and heading back to Kona.
Suneli found a Lychee tree in Hilo. It's right behind KTA Superstore (gps to Lychee tree: 19.694722, -155.070379). So we ended up eating a few Lychees on the way back to Kona :-)
Drove to Wailuku River State Park, to see Rainbow Falls. Or rather, to see the rainbow. It wasn't too sunny, but... eh, the rainbow wasn't there. So much for that.
On way to Kona, stopped by Hapuna Beach State Park. Had a few minutes of fun in the waves :-)
In Kona staying in Kona Bali Kai Resort. Very nice place. Room is facing the ocean, so can see turtles and sunset right from the balcony. The place has a hot-tub and a pool. Rooms have a kitchen, and a laundry machine (and dryer). All in all, a much better place than Kona Sheraton.
The pool is deep-er. Most of it is 7 feet deep---and I can't swim well (mostly barely stay afloat) :-/
- Alex; 20170414
Again starting day with Jaggar. Lava is in awesome mode. Can see bubbling lake of lava right from the observation place at Jaggar. It's amazing. I've never seen it that awesome before.
Then driving to Kalapana to walk to the black-sand-beach, and to see the Landing pad for extra terrestrials in Kalapana. Yep, it's actually there!!! (right on the walk to the black-sand-beach).
The steam plume from lava-ocean-entry is apparently ~4 miles away from the end of the road---so walkable, but pointless, as Jaggar is providing a much cooler view of the bubbling lava.
Doing another hike on Mauna Kea tonight, then another session of stargazing. Though kind of cloudy at the Mauna Kea visitor's center :-/
Finishing off the day with a trip to Jaggar. And it's *more* awesome than before! This is lava putting on a show just for our visit.
- Alex; 20170413
Starting day with Jaggar. Can actually see a bit of orange lava on the side.
Then Thurston Lava Tube, etc.
Driving to Mauna Loa trailhead via the Mauna Loa Access road (gps: 19.538092, -155.575222). It's been raining almost the entire way---and cleared up towards the end.
After Mauna Loa, drove up Mauna Kea. Went to visit the Mauna Kea lake, Lake Waiau, and ran up the Mauna Kea summit right before the sunset. Then drove down to the visitor's center to see stars, etc.
Then off to Jaggar to see lava. It's actually getting much bigger.
- Alex; 20170412
Early morning helicopter tour from Hilo with Paradise Helicopters. It was raining really badly just 20 minutes prior to the tour, yet somehow by the time the tour started, it was clear and even sunny.
From helicopter, saw lava entering the ocean, and the lava lake, etc.
Then onto Akaka Falls. On way to Akaka, Suneli spotted sugarcane growing right on the side of the road... so we broke a bunch, and had a hard-chewy snack for the rest of the trip.
We then spotted folks selling coconuts-pineapple-sugarcane by the side of the road... too bad most tourists don't realize that sugarcane is the stuff they drive by without thinking :-)
Then onto Umauma Falls, which we didn't locate. Talked to a random local, who turned out to own the property on which Umauma falls is located, and apparently (according to him), the waterfall everyone advertises as ``Umauma falls'' isn't the actual ``Umauma falls'', as that one isn't publicly accessible (it's on his property). The zip-line folks who do Umauma zipline actually string the zip-line over a different waterfall, that they choose to call Umauma falls... anyways, he guided us towards Kamae'e Falls (gps: 19.893516,-155.148011), which were pretty neat (nobody around---awesome place to just chill out).
The viewing area by Kamae'e Falls was filled with Mimosa pudica, the `touch-me-not' plant. It was fun touching entire patches and have them close up :-)
Next stop: Waipio Valley. Actually drove down all the way to the beach. Wonderful experience.
And then onto Rainbow Falls...
...and after chilling out at TWP for a bit, onto Jaggar to see the glowing crater :-)
- Alex; 20170411
After a bit of water-sliding and pool... driving out of Kona and to Volcano.
On way to Volcano, stopped by to visit Green Sand Beach. The drive there was much tougher than I anticipated---I never drove that road before---and it's one crazy drive. Had to follow another car on the way back---as a few wrong turns could get you seriously stuck (or flipped over).
Arrived in Volcano, and checked into The Wright Place. Awesome tiny place. Has everything one would need.
- Alex; 20170410
First `main' day in Hawaii: drove to a touristy place to get breakfast, and found a submarine tour place---so did a submarine tour (Atlantis Adventures).
Then drove to Kekaha Kai State Park (Makalawena Beach), hoping to see Turtles. Saw a few in the water.
Then drove to Kaloko-Honokohau National Historical Park, and saw a lot of turtles on the beach...
Sadly didn't find the turtles who took my sunglasses all those years ago :-/
- Alex; 20170409
And off to Hawaii: two hop flight, from NYC to Phoenix, then from Phoenix to Kona.
Landed in Kona, rented a Jeep, and off to Sheraton to chill out for a few days. Alamo Jeep rental sux... even though I prepaid for the rental, and flight arrived a bit later than the `pickup' time, they didn't have a Jeep ready to go, and made us wait over 30 minutes before they got one ready for us. In other words, Alamo sux---even on prepaid rentals :-/
Didn't do much on this first day (arrived in Kona around 3pm). Just passed out in the hotel.
- Alex; 20170408
Neat idea: Moving histogram! Instead of a moving average (and standard deviation), imagine a moving histogram... Using that, can figure out outliers, or normalize data into 0 to 1 range, etc.
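A quick sketch of what that could look like, under a few assumptions not in the note above: fixed equal-width bins over a known range, a window of the last N values, and a "fraction of the window in lower bins" rank used as the 0-to-1 normalization. The class MovingHistogram and all its parameter names are made up for illustration.

```scala
import scala.collection.mutable

// Sliding histogram over the last `capacity` values, with `bins` equal-width
// buckets on [lo, hi); out-of-range values are clamped into the end buckets.
class MovingHistogram(lo: Double, hi: Double, bins: Int, capacity: Int) {
  private val counts = new Array[Int](bins)
  private val window = mutable.Queue.empty[Int] // bin index of each live value

  private def binOf(v: Double): Int = {
    val i = ((v - lo) * bins / (hi - lo)).toInt
    math.max(0, math.min(bins - 1, i))
  }

  // Add a value; once the window is over capacity, the oldest value rolls off.
  def add(v: Double): Unit = {
    val b = binOf(v)
    counts(b) += 1
    window.enqueue(b)
    if (window.size > capacity) counts(window.dequeue()) -= 1
  }

  // Fraction of live values in bins strictly below v's bin (a 0..1 rank).
  def rank(v: Double): Double =
    if (window.isEmpty) 0.0 else counts.take(binOf(v)).sum.toDouble / window.size

  // Fraction of live values in bins strictly above v's bin.
  def above(v: Double): Double =
    if (window.isEmpty) 0.0 else counts.drop(binOf(v) + 1).sum.toDouble / window.size

  // Outlier: nearly the whole window sits on one side of v's bin.
  def isOutlier(v: Double, tail: Double = 0.05): Boolean =
    window.nonEmpty && (rank(v) >= 1.0 - tail || above(v) >= 1.0 - tail)
}
```

Unlike a moving mean and standard deviation, this makes no assumption about the shape of the distribution; the cost is picking a range and bin count up front.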
- Alex; 20170406
...and back in NYC :-)
Feels like I'm sleep walking :-/
- Alex; 20170403
Doing Carlsbad Caverns National Park. Long walk down and elevator ride up.
Then off to the UFO ``research center'' in Roswell, and back to ABQ airport.
- Alex; 20170402
Arrived in Albuquerque. Rented a Chevy Malibu, and headed to Trinity Site.
Climate change must've cooled New Mexico quite a bit, as it's VERY chilly. Don't think I've ever seen Trinity Site that cold.
After Trinity drove to Guadalupe Mountains National Park. Got there by around 6:30PM---ran up the Guadalupe Peak; 90 minutes up, and 90 minutes down. Missed the sunset by just a bit.
- Alex; 20170401
NOTICE: We DO NOT collect ANY personal information on this site.
© 1996-2016 by End of the World Production, LLC.