Quick note about Hive and Presto

I am building an application on Treasure Data, which provides Hive and Presto for executing jobs. Here are my notes on the two.

Hive is a program to manage big data, built on top of Hadoop.

Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data summarization, query, and analysis. Hive gives an SQL-like interface to query data stored in various databases and file systems that integrate with Hadoop. Traditional SQL queries must be implemented in the MapReduce Java API to execute SQL applications and queries over distributed data. Hive provides the necessary SQL abstraction to integrate SQL-like queries (HiveQL) into the underlying Java without the need to implement queries in the low-level Java API. Since most data warehousing applications work with SQL-based querying languages, Hive aids portability of SQL-based applications to Hadoop.


https://en.wikipedia.org/wiki/Apache_Hive

How Hive works

Hive translates SQL queries into multiple stages of MapReduce and it is powerful enough to handle huge numbers of jobs (Although as Arun C Murthy pointed out, modern Hive runs on Tez whose computational model is similar to Spark’s). MapReduce is fault-tolerant since it stores the intermediate results into disks and enables batch-style data processing. Many of our customers issue thousands of Hive queries to our service on a daily basis. A key advantage of Hive over newer SQL-on-Hadoop engines is robustness: Other engines like Cloudera’s Impala and Presto require careful optimizations when two large tables (100M rows and above) are joined. Hive can join tables with billions of rows with ease and should the jobs fail it retries automatically. Furthermore, Hive itself is becoming faster as a result of the Hortonworks Stinger initiative.

How Presto Works

Presto is a distributed SQL query engine that can query data stored in Hadoop, among other sources.

In some instances simply processing SQL queries is not enough: it is necessary to process queries as quickly as possible so that data scientists and analysts can use Treasure Data for quickly gaining insights from their data collections. For these instances Treasure Data offers the Presto query engine. Presto is an in-memory distributed SQL query engine developed by Facebook that has been open-sourced since November 2013. Presto has been adopted at Treasure Data for its usability and performance.

Presto versus Hive: What You Need to Know


Use Hive for routine batch jobs. Use Presto for fast, interactive queries over smaller (simpler) data.

Active Directory

Active Directory (AD) is a directory service that Microsoft developed for Windows domain networks. It is included in most Windows Server operating systems as a set of processes and services. Initially, Active Directory was only in charge of centralized domain management. Starting with Windows Server 2008, however, Active Directory became an umbrella title for a broad range of directory-based identity-related services.


https://en.wikipedia.org/wiki/Active_Directory

Active Directory is a database-based system that provides authentication, directory, policy, and other services in a Windows environment.

LDAP (Lightweight Directory Access Protocol) is an application protocol for querying and modifying items in directory service providers like Active Directory, which supports a form of LDAP.


https://stackoverflow.com/questions/663402/what-are-the-differences-between-ldap-and-active-directory

Active Directory Domain Services is Microsoft’s Directory Server. It provides authentication and authorization mechanisms as well as a framework within which other related services can be deployed (AD Certificate Services, AD Federated Services, etc). It is an LDAP compliant database that contains objects. The most commonly used objects are users, computers, and groups. These objects can be organized into organizational units (OUs) by any number of logical or business needs. Group Policy Objects (GPOs) can then be linked to OUs to centralize the settings for various users or computers across an organization.

When people say “Active Directory” they typically are referring to “Active Directory Domain Services.” It is important to note that there are other Active Directory roles/products such as Certificate Services, Federation Services, Lightweight Directory Services, Rights Management Services, etc. This answer refers specifically to Active Directory Domain Services.


https://serverfault.com/questions/402580/what-is-active-directory-domain-services-and-how-does-it-work

Project Management

My notes on what a project is, and how to run and close one successfully.

 

A project is temporary in that it has a defined beginning and end in time, and therefore defined scope and resources.

And a project is unique in that it is not a routine operation, but a specific set of operations designed to accomplish a singular goal. So a project team often includes people who don’t usually work together – sometimes from different organizations and across multiple geographies.

The development of software for an improved business process, the construction of a building or bridge, the relief effort after a natural disaster, the expansion of sales into a new geographic market — all are projects.

And all must be expertly managed to deliver the on-time, on-budget results, learning and integration that organizations need.

From: https://www.pmi.org/about/learn-about-pmi/what-is-project-management

 

5 steps in project management.

  1. Initiating
  2. Planning
  3. Executing
  4. Monitoring and Controlling
  5. Closing

Initiating

  • Bring in the right people with a clear goal.

Define

  • Scope
  • Time
  • Quality
  • Communication method
  • What needs to be bought, subscribed to, or ordered

Planning

  • Write down everything to do (Gantt chart): features, stories, tasks
  • Risk management

Executing

  • Just do it
  • Are assignments OK?

Monitoring and Controlling

Check:

  • Is it on schedule?
  • Is quality OK?
  • Is the work staying within scope?
  • Is the initial risk management still good enough?

Closing

  • Check whether the project has (1) reached the end of its schedule and (2) completed its objectives.

Working with Pinnacle API – Get Leagues

This is a follow-up to Working with Pinnacle API.

The next operation is to get the list of leagues under a specific sport. Let’s continue from the previous view and fetch that list.

Let’s go to PinnacleAPIHandler and add another class function that calls the API operation v2/leagues?sportid={sportid}, and also prepare new Model classes for this operation's request and response.
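As a rough sketch of what such a class function could look like: the operation path comes from the text above, but the base address, the use of HTTP Basic authentication, and the method and parameter names are all assumptions made for illustration, so check them against the API manual. The response is returned as raw JSON here rather than being mapped to a Model.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

public static class PinnacleAPIHandler
{
    // Assumption: base address of the API; verify it in the API manual.
    private static readonly Uri BaseAddress = new Uri("https://api.pinnacle.com/");

    // Builds the relative path for the leagues operation named in the article.
    public static string BuildLeaguesPath(int sportId)
    {
        return "v2/leagues?sportid=" + sportId;
    }

    // Hypothetical helper: calls the leagues operation and returns the raw JSON body.
    // Assumption: the API accepts HTTP Basic authentication.
    public static async Task<string> GetLeaguesJsonAsync(int sportId, string username, string password)
    {
        using (var client = new HttpClient { BaseAddress = BaseAddress })
        {
            var token = Convert.ToBase64String(Encoding.UTF8.GetBytes(username + ":" + password));
            client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", token);
            client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

            var response = await client.GetAsync(BuildLeaguesPath(sportId));
            response.EnsureSuccessStatusCode(); // throws on a non-2xx status
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

The credentials would be read from Web.config by the caller and passed in, which keeps them out of the handler itself.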


Working with Pinnacle API

Today I am playing with the Pinnacle API in C#: getting game information and running statistics.

The reference document is the “Getting started” section: https://www.pinnacle.com/en/api/manual#getting-started

First you need to create an account here: https://www.pinnaclesports.com/secure/signup.aspx

Then get your client ID and store it in a secure place; I suggest Web.config.

Make sure your account is funded, otherwise the credentials will not work.

OK, from here on I assume you have your account and your credentials are in your Web.config.


Custom Error Page for ASP.Net MVC

Errors happen, and when they do you want to show users something nice. Here is how.

Go to web.config and find the <system.web> section, which is where custom errors are configured.
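A minimal sketch of what this setting might look like, assuming the goal is simply to switch custom errors on so that a friendly page is shown instead of the default ASP.NET error screen:

```xml
<system.web>
  <!-- "On" shows the custom error page to everyone;
       "RemoteOnly" keeps the detailed error for requests from the local machine -->
  <customErrors mode="On" />
</system.web>
```

During development, RemoteOnly is often the more convenient choice, since you still see the stack trace locally.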

Now go to App_Start/FilterConfig.cs and make sure the error-handling filter is registered.
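A sketch of what the registration typically looks like, assuming the filter in question is the standard HandleErrorAttribute that ships with ASP.NET MVC. The namespace MyApp is hypothetical.

```csharp
using System.Web.Mvc;

namespace MyApp // hypothetical namespace; use your project's own
{
    public class FilterConfig
    {
        // Called once at application start (from Global.asax) to register global filters.
        public static void RegisterGlobalFilters(GlobalFilterCollection filters)
        {
            // Catches unhandled exceptions from controller actions and renders the "Error" view.
            filters.Add(new HandleErrorAttribute());
        }
    }
}
```

HandleErrorAttribute only renders the error view when custom errors are enabled in web.config.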

Then go to Views/Shared and create a new view for the error page.

 

ASP.Net Return Empty Results

When you receive a request but cannot return an appropriate result, it is a good idea to return an HTTP status code that tells the caller what went wrong.

https://msdn.microsoft.com/en-us/library/system.web.mvc.httpstatuscoderesult(v=vs.118).aspx

And here is the full list:

https://msdn.microsoft.com/en-us/library/system.net.httpstatuscode(v=vs.110).aspx

The ones I use often:

When input parameter is not valid
HttpStatusCodeResult(HttpStatusCode.BadRequest)

When the access is unauthorized
HttpStatusCodeResult(HttpStatusCode.Unauthorized)

When there is nothing to show (likely had no matching data record to create page)
HttpStatusCodeResult(HttpStatusCode.NotFound)
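
To show how the three results above might fit together in one action, here is a minimal sketch. The controller, the action, and the FindProduct lookup are hypothetical names invented for illustration; only HttpStatusCodeResult and HttpStatusCode come from the framework.

```csharp
using System.Net;
using System.Web.Mvc;

public class ProductController : Controller // hypothetical controller
{
    public ActionResult Details(int? id)
    {
        // Input parameter is missing or not valid.
        if (id == null || id <= 0)
            return new HttpStatusCodeResult(HttpStatusCode.BadRequest);

        // Caller is not signed in.
        if (!User.Identity.IsAuthenticated)
            return new HttpStatusCodeResult(HttpStatusCode.Unauthorized);

        // No matching data record to build the page from.
        var product = FindProduct(id.Value);
        if (product == null)
            return new HttpStatusCodeResult(HttpStatusCode.NotFound);

        return View(product);
    }

    // Placeholder lookup; a real application would query its data store here.
    private static object FindProduct(int id)
    {
        return null;
    }
}
```

Checking the input first, then the caller, then the data keeps each failure mapped to exactly one status code.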

And when you want to tell search engine that page has been moved permanently, then use