Blog Archives

Security for Hadoop and big data analytics

As Hadoop analytics projects move to production environments, companies need appropriate user and data security across both Hadoop and user applications.  Jon “Natty” Natkins from Cloudera wrote a good article yesterday on Authorization and Authentication In Hadoop.

Yesterday, Datameer announced the release of Datameer 1.4.  One of the updates is tighter security at both the data layer and the user layer.  As Natty noted in his post, “one of the more confusing topics in Hadoop is how authorization and authentication work in the system”.  Datameer customers have the same concerns as they expand from a handful of data analysts to hundreds of business analysts.

Datameer manages both Hadoop and end-user security and provides additional functionality for authentication and authorization.  Below is a quick overview of Datameer security features.  The full details are in the datasheet Datameer Security (PDF).

Authentication
- Datameer provides LDAP / Active Directory (AD) authentication and integration
- Datameer supports connectivity to LDAPS (LDAP over SSL)
- Datameer supports operating in Hadoop environments secured by Kerberos

Authorization
- Datameer provides role-based access with delegation

For more information, here are some additional resources.
- Datasheet: Datameer Security
- Datameer Documentation
- Free Trial of Datameer

If you have any questions, contact us.

Posted in Uncategorized | 1 Comment

Will you beat Santa to Christmas morning?

It’s that time of the year again when travelers drag themselves to the airport to get on a plane and hopefully meet their loved ones at their destination in a reasonable amount of time.

Thanks to publicly available FAA data, Datameer ran some analytics and found these interesting insights from past Christmas holidays. The data comes to us from the Research and Innovative Technology Administration (RITA) and has detail on everything from individual airports to the specific reasons why delays might happen.

Using Datameer, we generated a spline graph of the average number of incidents on each day over an 8-year period. To no one’s surprise, the 21st and 22nd are the peak travel days, so if you’re traveling then, anticipate a sea of people at your airport and possible delays. Surprisingly, if you’re traveling on the 24th or 25th, most people will have already completed their travels and your journey might be a little less stressful.
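For readers who want to try the same kind of averaging outside Datameer, here is a minimal Python sketch. It is not how Datameer computes the graph; the function name `average_by_day` and the sample rows are our own assumptions, and the numbers are made up purely for illustration rather than taken from the FAA data.

```python
from collections import defaultdict

# Hypothetical rows of (year, day_of_december, incident_count).
# These numbers are invented for illustration, not real FAA figures.
daily_counts = [
    (2003, 21, 120), (2003, 24, 60),
    (2004, 21, 140), (2004, 24, 80),
]


def average_by_day(rows):
    """Average the incident count for each calendar day across all years."""
    totals = defaultdict(int)
    years_seen = defaultdict(int)
    for _year, day, count in rows:
        totals[day] += count
        years_seen[day] += 1
    # One average per day, ordered by day of the month.
    return {day: totals[day] / years_seen[day] for day in sorted(totals)}


averages = average_by_day(daily_counts)
```

Plotting `averages` with the day on the x-axis would give a curve of the same general kind as the one described above.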

So what does this look like when we look at individual airports and destinations?  Using a circular graph, we can see the most delayed routes, ranked by frequency of delays, below. We set this up in Datameer by first mapping out the travel routes in the worksheet, then grouping them and sorting by delay count in descending order.
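The group-and-sort step can also be sketched in plain Python. This is only an illustration of the idea, not the Datameer worksheet itself; `rank_routes` is a hypothetical helper and the flight records are an invented sample.

```python
from collections import Counter

# Hypothetical delayed-flight records as (origin, destination) pairs.
# Invented sample, not real FAA data.
delayed_flights = [
    ("ORD", "EWR"), ("ORD", "LGA"), ("ORD", "EWR"),
    ("DFW", "ORD"), ("ORD", "EWR"), ("ORD", "LGA"),
]


def rank_routes(flights):
    """Group delayed flights by route and sort by count, highest first."""
    counts = Counter(f"{origin}-{dest}" for origin, dest in flights)
    # most_common() returns (route, count) pairs in descending count order.
    return counts.most_common()


ranked = rank_routes(delayed_flights)
```

The counts in `ranked` are what a chart would translate into line thickness for each route.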

As the thickness of the lines in the diagram above shows, routes between Chicago O’Hare, Newark, and LaGuardia were the most heavily delayed. Chicago is one of the busiest hubs in the US, so there’s no surprise there. What IS surprising is that when we rank airports by the number of weather-related incidents in December, Dallas takes the top spot, beating out Chicago and Atlanta, as shown in the diagram below.  So if you’re flying from or through Dallas, you might want to pack some extra reading material.

And if you’re curious as to which travelers have spent the most time waiting in December, San Francisco travelers get that unfortunate title. Below we’ve taken the total number of minutes spent waiting to arrive at a particular airport by summing up the delayed-minutes field by destination.
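Summing a delay field by destination is a simple aggregation. The sketch below shows one way to do it in Python, assuming each record carries a destination code and a delay in minutes; the function name and the sample values are hypothetical and do not come from the dataset.

```python
from collections import defaultdict

# Hypothetical (destination, delay_minutes) records.
# Invented sample, not real FAA data.
arrivals = [("SFO", 90), ("DFW", 20), ("SFO", 75), ("DFW", 25), ("DFW", 15)]


def total_delay_by_destination(rows):
    """Sum delay minutes per destination, largest total first."""
    totals = defaultdict(int)
    for destination, minutes in rows:
        totals[destination] += minutes
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


totals = total_delay_by_destination(arrivals)
```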

Comparing this graph to the one above, we could deduce that SFO has lengthier delays while DFW has shorter but more frequent ones, and the data confirms this.
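The reasoning here comes down to dividing total delay minutes by the number of delay incidents for each airport. A small Python sketch of that arithmetic follows; the totals and counts are placeholders we made up to show the shape of the calculation, not the actual figures behind the graphs.

```python
def average_delay(total_minutes, incident_count):
    """Mean delay per incident; returns 0.0 when there were no incidents."""
    return total_minutes / incident_count if incident_count else 0.0


# Hypothetical (total_delay_minutes, incident_count) per airport.
# Placeholder values for illustration only.
airport_stats = {"SFO": (165, 2), "DFW": (60, 3)}

average_minutes = {
    code: average_delay(total, count)
    for code, (total, count) in airport_stats.items()
}
```

With these placeholder values, the airport with fewer incidents has the longer average delay, which is the pattern described for SFO and DFW.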

So, while we don’t expect you can alter your travel plans, we hope you find these insights informative.  There is a wealth of publicly available data like this FAA dataset that can provide valuable insights.  Datameer also offers a free trial of our solution here that you can use to analyze and explore this and virtually any other data.

Safe travels everyone. Happy Holidays from Datameer.

Posted in Big Data Analytics Perspectives