Adding CORS support in Jersey server

I had been having trouble getting the Cross-Origin Resource Sharing (CORS) header into my server responses. I tried many examples on the Web, including putting filters in my servlet web.xml file, but did not have any success.

The following simple lines of code (adapted from an example on Stack Overflow) did the trick for me, and the browser is now able to receive the response (a JSON object) from the service call. I am using Tomcat 7.0.42 and Jersey server 2.23.1.

Create the following Java class (call it CORSResponseFilter) in your project; its filter method sets the header "Access-Control-Allow-Origin" to "*", which is what we want.

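Roughly, the filter looks like the following. This is a minimal sketch for Jersey 2.x: only the Access-Control-Allow-Origin line is needed for this fix, and the other two headers are optional extras I added for preflight (OPTIONS) requests.

import java.io.IOException;
import javax.ws.rs.container.ContainerRequestContext;
import javax.ws.rs.container.ContainerResponseContext;
import javax.ws.rs.container.ContainerResponseFilter;

// Adds the CORS headers to every response produced by the Jersey application.
public class CORSResponseFilter implements ContainerResponseFilter {

    @Override
    public void filter(ContainerRequestContext requestContext,
                       ContainerResponseContext responseContext) throws IOException {
        // Allow any origin; tighten this for production if needed.
        responseContext.getHeaders().add("Access-Control-Allow-Origin", "*");
        // Optional extras so that preflight (OPTIONS) requests also succeed.
        responseContext.getHeaders().add("Access-Control-Allow-Headers",
                "origin, content-type, accept, authorization");
        responseContext.getHeaders().add("Access-Control-Allow-Methods",
                "GET, POST, PUT, DELETE, OPTIONS, HEAD");
    }
}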

Then register the filter in the web.xml file in the init-param section as follows.

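The registration looks roughly like this; the servlet name and the com.example.filters package are placeholders for whatever your project uses, and jersey.config.server.provider.classnames is the Jersey 2.x property that lists provider classes to load.

<servlet>
  <servlet-name>Jersey Web Application</servlet-name>
  <servlet-class>org.glassfish.jersey.servlet.ServletContainer</servlet-class>
  <init-param>
    <param-name>jersey.config.server.provider.classnames</param-name>
    <param-value>com.example.filters.CORSResponseFilter</param-value>
  </init-param>
  <load-on-startup>1</load-on-startup>
</servlet>

If your web.xml already has this init-param, just append the fully qualified filter class name to the existing comma-separated value.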

Then re-deploy the service; the header should now be in the response, and the browser should not have any issues.

 

My AAAI 2015 experience and feelings!

The 29th AAAI Conference on Artificial Intelligence (AAAI 2015) was a totally new experience for me in many ways: (1) it was a very big and broad conference; (2) since it was broad, there were papers from many sub-areas; (3) there were many tracks, and even keynotes, running in parallel; (4) I met new students and professionals; and (5) I had my paper accepted in the technical track 🙂. What follows are brief descriptions of the keynotes and papers I came across during AAAI 2015 that may help my colleagues in their research.

Invited talk by Oren Etzioni – Allen Institute for AI

Oren's keynote focused on tackling GOFAI (Good Old-Fashioned AI) problems using today's AI and ML techniques. He mentioned that he and the Allen Institute for AI do not focus on inventing or building completely new AI techniques, but on utilizing available resources to solve problems. He presented two systems: (1) ARISTO, a system that reads and analyzes text to answer questions, and (2) Semantic Scholar, a system similar to Google Scholar. Both systems look great. ARISTO can answer math geometry questions by reading the text and also understanding the accompanying graphs or sketches, which I thought was very challenging and exciting. It can successfully answer grade 4 geometry questions (and also shows good results for grades 9 and 11). Semantic Scholar will be available to the public this year; it will have more metadata than Google Scholar and will answer a lot of interesting queries about relationships between papers.

Invited talk by Rayid Ghani

Rayid's talk was about data science for social good. I recall that Pramod (one of my colleagues) received this fellowship last year. Their effort is to help scientists with AI, ML, and data analysis skills serve the common good. In doing so, they do not pick only people who are already doing social good, but also those who at least show an interest in doing so in the future. Throughout his talk he emphasized that being able to pick the right method or technique for a given problem is very important.

Invited talk by Meinolf Sellmann – IBM TJ Watson

Meinolf is from the IBM T. J. Watson group. His talk was about Watson applications and systems that use AI techniques. One system they have built selects relevant parameters and algorithms given the data and the environment. He mentioned that this is not the first attempt at this problem, but they have improved upon previous efforts. He showed how important it is to automatically select parameters and algorithms when a computer sees completely different data (this is in fact useful for Watson). He also showed a short demo of an automatic speech-generation application using Watson. His takeaway message was that this is not a complex or advanced technology for Watson, but it makes a great difference to customers: when IBM says that Watson (i.e., a computer) can generate a speech for them on a given topic using Wikipedia, it goes a long way toward impressing customers. There is also a good lesson I learned from his presentation: he used very simple, easy-to-understand graphics. I recall that his slides used "Disney Cars" pictures to explain how to select parameters or algorithms in different contexts. I believe everybody in the audience understood his points, and the to-the-point short demos during the presentation kept the audience engaged.

Invited talk by Michael Bowling

This talk was about how to build computers that achieve John von Neumann's dream of playing chess the way humans do – guessing the other players' moves, using deception tactics, and bluffing – that is, behaving as in real life. The talk covered how computers have advanced over time in computing many decisions per second and in their ability to reason. The talk wasn't as interesting as I had expected, and I should have attended Lise Getoor's talk (about combining statistics and semantics to solve big data problems) instead.

Thoughts about papers and trend

The conference was very big and hence had a broad range of paper themes. Among the themes that match Kno.e.sis, social science was dominant. I found that the trend in this direction now is to predict users' demographics or emotions from social data (including website traffic). Most of this work uses machine learning techniques.

Overall comments

AAAI had some interesting new additions this time: lunch with a fellow, and a job fair. The job fair was nice, and it gave students and prospective graduates the opportunity to talk with some of the leading companies (including Google, Microsoft, IBM, PARC, etc.). I had some successful discussions with a few of them during the fair.

I also got to know some of the history of AI. Who knew that A* search is a by-product of the first robot (Shakey)?

AAAI 2015 had many sessions, and even keynotes, running in parallel.

Many, or at least a considerable number of, papers were about utilizing or improving current AI/ML techniques. Fewer papers introduced a completely new research theme.

AI and the Web was, I think, the track with the largest number of papers.

Since there were so many papers, some were given a two-minute presentation slot plus a poster session.

A good number of people came to my poster, and many seemed to be impressed 🙂. By the end, I felt like I was losing my voice from explaining it to that many people, but it was a very nice experience.

My experience in processing SGML files in Java – and some issues with parsing

SGML (Standard Generalized Markup Language) is a predecessor of XML (Extensible Markup Language) that does not care about "well-formedness"; that is, starting and closing tags do not need to match in SGML documents. I found out what SGML is when I needed to process an SGML file to extract vocabulary terms for a project I work on. I thought that, since it is pretty similar to XML, I should be able to use an XML parser to extract details from an SGML file.

I used the default Java XML parser, with code like the following:

import java.io.File;
import javax.xml.parsers.*;
import org.w3c.dom.Document;
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse(new File(filename));

I had to remove some tags from the file because some tags had no closing tags. Another problem I faced was that I did not have a DTD for the SGML file, so entity references like "&ldquo;" gave errors like the following in the parser:

The entity "ldquo" was referenced, but not declared.

There is a fix for this: use a standard DTD available online (e.g., from the W3C) to provide the definitions of these references. So I used the following DTD declaration at the top of the file:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

I ran the program and it worked fine. But when I tried to run it again a few days later, it got stuck during parsing. I couldn't find any bug in my code other than in the Java parsing step itself. I searched the web for possible reasons and found out that the W3C sometimes blocks URL access to these documents because of high traffic. So I assumed that the Java parser was trying to download the DTD file (or something referenced inside it) for processing. I therefore downloaded the DTD and referenced it at the top of the SGML file as a local file, like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "xhtml1-strict.dtd">

But the problem still persisted. Then I figured out that this DTD itself refers to some other documents online, and I determined that they are not required for my SGML document processing, since it only needs a few simple entities like "&ldquo;". The following is the section I removed from the local DTD file.

<!--================ Character mnemonic entities =========================-->

<!ENTITY % HTMLlat1 PUBLIC
   "-//W3C//ENTITIES Latin 1 for XHTML//EN"
   "xhtml-lat1.ent">
%HTMLlat1;

<!ENTITY % HTMLsymbol PUBLIC
   "-//W3C//ENTITIES Symbols for XHTML//EN"
   "xhtml-symbol.ent">
%HTMLsymbol;

<!ENTITY % HTMLspecial PUBLIC
   "-//W3C//ENTITIES Special for XHTML//EN"
   "xhtml-special.ent">
%HTMLspecial;

After these changes, the default Java XML parser worked fine for my SGML file processing, and the hang went away because, with the local DTD reference in place, the parser no longer had to fetch the W3C DTD online.
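As a side note, another option I did not use here (so treat this as a sketch; the local "dtd" folder name is my own choice) is to keep the original DOCTYPE untouched and instead give the parser an EntityResolver that serves the DTD and the .ent files it includes from local copies, so nothing is fetched from w3.org:

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Resolve external entities (the DTD and its .ent files) from a local "dtd" folder.
docBuilder.setEntityResolver(new EntityResolver() {
    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        // Map e.g. .../TR/xhtml1/DTD/xhtml1-strict.dtd to dtd/xhtml1-strict.dtd
        String name = systemId.substring(systemId.lastIndexOf('/') + 1);
        File local = new File("dtd", name);
        // Returning null falls back to the default (network) resolution.
        return local.exists() ? new InputSource(new FileInputStream(local)) : null;
    }
});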

Interesting keynotes at iSemantics 2013

I attended iSemantics 2013 to present my work on property alignment in Linked Data, and I listened to several interesting and varied keynotes, which I try to recap here.

 

Tim from Skype

Tim was the lead developer of Skype from the beginning, when it faced many different challenges. His talk centered around what would have happened if Skype had started doing knowledge management about 5 years ago (this year they celebrate 10 years). In his opinion, they lost part of their user base because they didn't do well on group chat features, and by the time they wanted to pay attention to this and analyze user behavior, many others were already in the field. As I see it, Skype had a distinct advantage because it was the tool many people chose to contact their loved ones, and it was popular for making video calls. But they didn't add new features fast enough, and as Tim mentioned, the early development practices and environment at Skype were not at all perfect.

 

Ed Chi from Google

Ed is a senior research scientist at Google, now working on the Google+ team. He was earlier at PARC (Palo Alto Research Center) and moved to Google about 3 years ago, when they were thinking of developing Google+. His work at Google is interesting, and his talk drew on what he learned at PARC and then at Google with Google+. He shared his findings on the science behind Wikipedia from his time at PARC. In its initial phase, Wikipedia showed exponential growth in page editing, and at one point the growth became linear. He explained this using a theory from economics: in real life, because of the finite amount of energy, animals cannot keep reproducing forever. For example, if there are two rabbits at the beginning, they will produce another two, the resulting four will become eight, and so on, but the growth eventually levels off. Then again, we cannot draw a line where Wikipedia's growth ends, as it seems to be relative: the world's knowledge has changed and expanded over the last 100 years, so it is still unknown how much we know. Also, the reason for Wikipedia's initial growth rate is still unanswered.

The last part of his talk was about how and why they developed Google+. According to their analysis, people do not want to share everything with everybody (as on Facebook); they wanted different privacy settings (something I had also imagined at the time). The statistics showed that ~60% of users did not want to publish everything to everybody. He thinks this might explain why Facebook and Twitter are still so popular even though they lack such privacy settings: about half of users simply do not care with whom they share their data. So they came up with the "circles" idea, where people can post and only selected people can see the post. He also showed some real-world example screenshots from Twitter where politicians posted status updates without realizing they would reach everyone (because there were no privacy settings).

He also talked about his current research direction, where he dreams of reaching non-English-speaking audiences. According to the statistics (from PARC rather than Google, because of some restrictions), ~51% of online social users use English, so if you focus only on those users you lose about half of the audience. Their recent analysis of users across different languages showed that there are tight connections between languages: people post the same thing in different languages, sometimes even the same user. He thinks these inter-relationships can be used for recommendations and analysis in users' own languages, and also to detect topics, etc. (interesting analytics, though I do not remember everything now).

 

Aldo Gangemi from STLAB

Gangemi's talk was more semantics-oriented, and he presented the idea that we need better framing of data representation on the web (the Semantic Web). He gave a few examples of how we can do that by relating words in posts. This led to a discussion of how well we should capture data. For example, in DBpedia the knowledge extraction is incomplete and obvious relationship details are missing, which leads to incomplete representations on the web and has adverse effects on applications. Interesting points.

 

Yves Raimond from BBC

Yves's talk was about the more practical side of the Semantic Web, which I think is also important so that we can say somebody is using the technology in real-world settings. He talked about what they have at the BBC: a very big collection of data that the BBC has broadcast over the past years. He first showed that the current BBC web pages are built on top of RDF and ontologies; one example was the 2010 World Cup web page. He also showed the rendering of the web page and the RDF links to sources within their pages. This is cool, very practical, and enjoyable.

Then he talked about what they want to do and what they are doing now. They are re-modeling the ontologies they have and adding new ones; they want to populate them and publish the datasets online in the Linked Data paradigm. They want to use (and are currently using) LOD techniques to link resources from the program data they have. But according to him (as I remember), processing 10 years of program data would take 4 years: first they have to convert it into text, then parse it and link it. For this they are now using the Amazon cloud, so it takes only weeks. But sometimes the program data is incorrect or incomplete, so they want to crowd-source it so that people will identify and correct the errors. Of the four talks, this is the one I enjoyed the most, as it showed practical use of Semantic Web and Linked Data technologies.

Google GWT with Eclipse eats up disk space

I was recently coding with Google GWT installed in Eclipse, experimenting with the code by frequently starting and stopping the application as a web application. What I noticed is that the remaining disk space (on the C drive) kept shrinking for no apparent reason. I wasn't installing any applications, yet my free disk space kept going down, and I was surprised to see that I had lost around 10 GB. Then I got the feeling that maybe GWT was using disk space every time I ran the program in Eclipse (Run As > Web Application). I found out that, for some reason, GWT stores files in the Local/Temp folder, and in my experience removing these GWT files does not harm the development environment. I was using Windows 7, and the following is the path where you can clean these unwanted files to free disk space when using Google GWT with Eclipse.

C:/Users/<user>/AppData/Local/Temp
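Before deleting anything, it can help to see how much space the leftovers actually take. A small sketch like the one below sums up the sizes of Temp entries whose names contain "gwt"; the name filter is only my assumption about how the temporary files are labeled, so inspect your own Temp folder first.

import java.io.File;

public class GwtTempUsage {
    public static void main(String[] args) {
        // On Windows, java.io.tmpdir usually points at C:/Users/<user>/AppData/Local/Temp
        File tempDir = new File(System.getProperty("java.io.tmpdir"));
        File[] entries = tempDir.listFiles();
        if (entries == null) {
            return;
        }
        long total = 0;
        for (File entry : entries) {
            // The "gwt" name filter is an assumption; adjust it to what you see in your Temp folder.
            if (entry.getName().toLowerCase().contains("gwt")) {
                long bytes = size(entry);
                total += bytes;
                System.out.println(entry.getName() + " : " + (bytes / (1024 * 1024)) + " MB");
            }
        }
        System.out.println("Total: " + (total / (1024 * 1024)) + " MB");
    }

    // Recursively compute the size of a file or directory.
    private static long size(File f) {
        if (f.isFile()) {
            return f.length();
        }
        long sum = 0;
        File[] children = f.listFiles();
        if (children != null) {
            for (File child : children) {
                sum += size(child);
            }
        }
        return sum;
    }
}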