sábado, 27 de julio de 2013

SAP HANA OData and R

As you might have discovered by now...I love R...it's just an amazing programming language...

By now...I have integrate R and SAP HANA via ODBC and via the SAP HANA-R integration...but I have completely left out the SAP HANA OData capabilities.

For this blog, we're going to create a simple Attribute View, expose it via SAP HANA and then consume it on R to display a nice and fancy graphic -;)

First, let's create an Attribute View and call it FLIGHTS. This Attribute View is going to be composed of the tables SPFLI, SCARR and SFLIGHT and will output the fields PRICE, CURRENCY, CITYFROM, CITYTO, DISTANCE, CARRID and CARRNAME. If you wonder why so many fields? Just so I can use it in another examples -;)


With the Attribute View ready, we can create a project in the repository and the necessary files to expose it as an OData service.

First, we create the .xsapp file...which should be empty -:P

Then, we create the .xsaccess file with the following code...

.xsaccess
{
          "exposed" : true,
          "authentication" : [ { "method" : "Basic" } ]
}

Finally, we create a file called flights.xsodata

flights.xodata
service {
          "BlagStuff/FLIGHTS.attributeview" as "FLIGHTS" keys generate local "Id";
}

When everything is ready...we can call our service to test it...we can call it as either JSON or XML. For this example, we're going to call it as XML.


Now that we know its working...we can go and code with R -:D For this...we're going to need 3 packages (That you can install via RStudio or R itself), ggplot2, RCurl and XML.

HANA_OData_and_R.R
library("ggplot2")
library("RCurl")
library("XML")
web_page = getURL("XXX:8000/BlagStuff/flights.xsodata/FLIGHTS?$format=xml", userpwd = "SYSTEM:******")
doc <- xmlTreeParse(web_page, getDTD = F,useInternalNodes=T)
r <- xmlRoot(doc)
 
carrid<-list()
carrid_list<-list()
carrid_big_list<-list()
price<-list()
price_list<-list()
price_big_list<-list()
currency<-list()
currency_list<-list()
currency_big_list<-list()
 
for(i in 5:xmlSize(r)){
  carrid[1]<-xmlValue(r[[i]][[5]][[1]][[2]])
  carrid_list[i]<-carrid[1]
  price[1]<-xmlValue(r[[i]][[5]][[1]][[8]])
  price_list[i]<-price[1]
  currency[1]<-xmlValue(r[[i]][[5]][[1]][[7]])
  currency_list[i]<-currency[1] 
}
 
carrid_big_list<-unlist(carrid_list)
price_big_list<-unlist(price_list)
currency_big_list<-unlist(currency_list)
flights_table<-data.frame(CARRID=as.character(carrid_big_list),PRICE=as.numeric(price_big_list),
                          CURRENCY=as.character(currency_big_list))
flights_agg<-aggregate(PRICE~.,data=flights_table, FUN=sum)
flights_agg<-flights_agg[order(flights_agg$CARRID),]
 
flights_table<-data.frame(CARRID=as.character(flights_agg$CARRID),PRICE=as.character(flights_agg$PRICE),
                          CURRENCY=as.character(flights_agg$CURRENCY))
 
ggplot(flights_table, aes(x=CARRID, y=PRICE, fill=CURRENCY)) + geom_histogram(binwidth=.5, 
       position="dodge", stat="identity")

Basically, we're are reading the OData service that comes in XML format and parsing it into a tree so we can extract it's components. One thing that might call your attention is that we're using xmlValue(r[[i]][[5]][[1]][[2]]) where i starts from 5.

Well...there's an easy explanation -:) if we access our XML tree...the first value it's going to be "feed", the second "id" and so on...the fifth is going to be "entry" which is what we need. Then for the next [[5]]...inside "entry", the first value it's going to be "id", the second "title" and so on...the fifth is going to be "content" which is what we need. Then for the next [[1]]...inside "content", the first value it's going to be "properties" which is what we need. And for the last [[2]]...inside "properties" the first value it's going to be "id" and the second it's going to "carrid" which is what we need. BTW, xmlValue will get the value of the XML tag -:P

In other words...we need to analyze the XML schema and determine what we need to extract...after that, we simply need to assign those values to variables and create our data.frame.

Then we create an aggregation to sum the PRICE values (In other words, we're going to have the PRICE grouped by CARRID and CURRENCY), then we sort the values and finally we create a new data.frame so we can present the PRICE as character instead of numeric...just for better presentation of the graphic...

Finally...we call the plot and we're done -:)


Happy plotting! -:)

Greetings,

Blag.

jueves, 25 de julio de 2013

SAP HANA and R - Keep shining

Since I discovered Shiny and published my blog A Shiny example - SAP HANA, R and Shiny I always wanted to actually run a Shiny application from SAP HANA Studio, instead of having to call it from RStudio and having to use an ODBC connection.

A couple of days ago...this blog Let R Embrace Data Visualization in HANA Studio gave me the power I need to keep working on this...but of course...life is not that beautiful so I still need to do lots of things in order to get this done...

First...cygwin didn't worked for me -:(  so I used Xming instead -;)

Now...one thing that it's really important is to have all the X11 packages loaded into the R Server...so just do this...

Connect to your R Server via Putty and then type "yast" to enter the "Yet another setup tool". (Make sure you tick the X11 Forwarding)...


Search and install everything related to X11-Devel. Also install/update your Firefox browser (also on yast).


With that ready...we can keep going -;)

If you had R installed already...please delete it...as easy as this...

Deleting_R
rm -r R-2.15.0

Then, download the source again...keep in mind that we will need R-2.15.1

Get_R_Again
wget http://cran.r-project.org/src/base/R-2/R-2.15.1.tar.gz

Now...we need support for jpeg images...so let's download a couple of files...

Getting_support_for_images
wget http://prdownloads.sourceforge.net/libpng/libpng-1.6.3.tar.gz?download
wget http://www.ijg.org/files/jpegsrc.v9.tar.gz
 
tar zxf libpng-1.6.3.tar.gz
tar zxf jpegsrc.v9.tar.gz
 
mv libpng-1.6.3 R-2.15.1/src/gnuwin32/bitmap/libpng
mv jpeg-9 R-2.15.1/src/gnuwin32/bitmap/jpeg-9
 
cd R-2.15.1/src/gnuwin32/
cp MkRules.dist MkRules.local
vi MkRules.local

When you run vi on the file you should comment out the bitmap.dll source directory lines just like in the image (notice that I'm not dealing with TIFF images, as they didn't worked for me)...


Now, we need to into each folder and compile the libraries...

Compiling_libraries
cd R-2.15.1/src/gnuwin32/bitmap/libpng
./confire
make
make install
cd ..
cd jpeg-9
./configure
make
make install

When both libraries finished compiling...we can go an compile R -;)

Compiling_R
cd
cd R-2.15.1
./configure --enable-R-shlib --with-readline=no --with-x=yes
make clean
make
make install

As you can see...where using the parameter --with-x=yes to indicate that we want to have X11 into our R installation. As we compiled the JPEG and PNG libraries first...we will have support for this on R as well -;)

For sure...this will take a while...R compilation is a hard task -:P But in the end you should be able to confirm by doing this...

Checking_installation
R
capabilities()


Now...it's time to install Shiny -8)



Installing_Shiny
install.packages("shiny", dependencies=TRUE)

Easy as cake -:)

But here comes another tricky part...we need to create a new user...why? Because we mostly had a previous user to run the Rserve...that was created before we installed X11...so just create a new one -:)
Creating_new_user
useradd -m login_name
passwd login_name

For the X11 to work perfectly...we need to do another thing...

Get_Magic_Cookie
xauth list
echo $DISPLAY

This will return us a line that we should copy in a notepad...then...we need to log of and log in again via Putty (with the X11 Forwarding) but this time using our new user...the second line will tell us about the display, so copy that one as well...
Once logged with the new user...do this...

Assign_Magic_Cookie_and_Display
xauth add //Magic_Cookie_from_Notepad//
export DISPLAY=localhost:**.* //number get from the $DISPLAY...like 10.0 or 11.0

Now...we're are complete ready to go...

Start the Rserve server like this...

Start_Rserve
R CMD Rserve --RS-port 6311 --no-save --RS-encoding "utf8"

When our Rserve is up and running...it's time for SAP HANA to make it's entrance -;) What I really like about Shiny...is that...in the past you needed to create two files to make it work UI.R and Server.R...right now...Shiny uses the Bootstrap framework so we can create the webpage using just one file...or call it directly from the SAP HANA Studio -;)

Calling_Shiny_from_SAP_HANA_Studio.sql
CREATE TYPE SNVOICE AS TABLE(
CARRID CHAR(3),
FLDATE CHAR(8),
AMOUNT DECIMAL(15,2)
);
 
CREATE TYPE DUMMY AS TABLE(
ID INT
);
 
CREATE PROCEDURE GetShiny(IN t_snvoice SNVOICE, OUT t_dummy DUMMY)
LANGUAGE RLANG AS
BEGIN
library("shiny")
 
runApp(list(
  ui = bootstrapPage(
    pageWithSidebar(
      headerPanel("SAP HANA and R using Shiny"),
      sidebarPanel(selectInput("n","Select Year:",list("2010"="2010","2011"="2011","2012"="2012"))),
      mainPanel(plotOutput('plot', width="100%", height="800px"))
    )),
  server = function(input, output) {
    output$plot <- renderPlot({
      year<-paste("",input$n,sep='')
      t_snvoice$FLDATE<-format(as.Date(as.character(t_snvoice$FLDATE),"%Y%m%d"))
      snvoice<-subset(t_snvoice,format(as.Date(t_snvoice$FLDATE),"%Y") == year)
      snvoice_frame<-data.frame(CARRID=snvoice$CARRID,FLDATE=snvoice$FLDATE,AMOUNT=snvoice$AMOUNT)
      snvoice_agg<-aggregate(AMOUNT~CARRID,data=snvoice_frame,FUN=sum)
      pct<-round(snvoice_agg$AMOUNT/sum(snvoice_agg$AMOUNT)*100)
      labels<-paste(snvoice_agg$CARRID," ",pct,"%",sep="")
      pie(snvoice_agg$AMOUNT,labels=labels)
    })
  }
))
END;
 
CREATE PROCEDURE Call_Shiny()
LANGUAGE SQLSCRIPT AS
BEGIN
snvoice = SELECT CARRID, FLDATE, AMOUNT FROM SFLIGHT.SNVOICE WHERE CURRENCY = 'USD';
CALL GetShiny(:snvoice,DUMMY) WITH OVERVIEW;
END;
 
CALL Call_Shiny

I'm not going to explain the code, because you should learn some R and Shiny -:P But if you wonder why I have a "dummy" table...it's mainly because you can't create an Stored Procedure in R Lang that doesn't have an OUT parameter...so...does nothing but helps to run the code :)

When we call the script or the procedure Call_Shiny, the X11 from our Server is going to call Firefox which is going to appear on our desktop like this...


We can choose between 2010, 2011 and 2012...every time we choose a new value, the graphic will be automatically updated...


Before we finish...keep in mind that this approach is really slow...our R Server will send the information via X11 Forwarding to our machine, and will render the Firefox browser...also...we have a timer...so after so many seconds...we will have a Timeout...of course this can be configured, but for Performance purposes...we should limit the time the communication between our SAP HANA and R servers...

Hope you like this blog -:) and see you on the next one -;)

Greetings,

Blag.