  1. Blazegraph (by SYSTAP)
  2. BLZG-692

nxparser fails with uppercase language tag

    Details

      Description

      I am hitting an NPE inside nxparser for the tbl 6-degrees of freedom crawl at [1]. I see this in the log:

      Aug 22, 2012 10:25:59 AM org.semanticweb.yars.nx.Literal getData
      WARNING: Something wrong with the literal-backing string. The parsing regex pattern didn't match. Check the string for correct N3 syntax. The malicious string is: "Up to 11.9"@FR
      

      And then this trace.

      Caused by: java.lang.NullPointerException
      	at org.semanticweb.yars.nx.util.NxUtil.unescape(NxUtil.java:178)
      	at org.semanticweb.yars.nx.util.NxUtil.unescape(NxUtil.java:164)
      	at org.semanticweb.yars.nx.Literal.toString(Literal.java:235)
      	at com.bigdata.rdf.rio.nquads.NQuadsParser.parse(NQuadsParser.java:297)
      	at com.bigdata.rdf.rio.nquads.NQuadsParser.parse(NQuadsParser.java:178)
      

      AndreasHarth wrote:

      The issue is an uppercase language string.
      
      If you change PATTERN in Literal.java (add A-Z to the regex):
      private static final Pattern PATTERN = Pattern
              .compile("(?:\"(.*)\")(?:@([a-zA-Z]+(?:-[a-zA-Z0-9]+)*)|\\^\\^(<\\S+>))?");
      
      it'll parse fine.
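
A minimal sketch of why the quoted pattern accepts the literal from the log. PROPOSED is the pattern exactly as given in the comment. LOWERCASE_ONLY is a hypothetical variant written only for contrast; it is an assumption about what a lowercase-only language-tag pattern might look like, not a copy of the nxparser 1.2.2 source. The class name is also made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LanguageTagPatternDemo {
    // Pattern as quoted in the comment (A-Z allowed in the language tag).
    static final Pattern PROPOSED = Pattern.compile(
            "(?:\"(.*)\")(?:@([a-zA-Z]+(?:-[a-zA-Z0-9]+)*)|\\^\\^(<\\S+>))?");

    // ASSUMPTION: hypothetical lowercase-only variant, for contrast only.
    // It is not taken from the nxparser source.
    static final Pattern LOWERCASE_ONLY = Pattern.compile(
            "(?:\"(.*)\")(?:@([a-z]+(?:-[a-zA-Z0-9]+)*)|\\^\\^(<\\S+>))?");

    public static void main(String[] args) {
        // The literal reported in the warning above.
        String literal = "\"Up to 11.9\"@FR";

        // The lowercase-only variant cannot consume "@FR", so a full match fails.
        System.out.println("lowercase-only matches: "
                + LOWERCASE_ONLY.matcher(literal).matches());

        // The quoted pattern matches and exposes the lexical form and the tag.
        Matcher m = PROPOSED.matcher(literal);
        if (m.matches()) {
            System.out.println("lexical form: " + m.group(1));
            System.out.println("language tag: " + m.group(2));
        }
    }
}
```

Group 1 holds the lexical form, group 2 the language tag, and group 3 the datatype IRI; for a language-tagged literal group 3 is null.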
      
      We're looking into the issue to decide where to put in a fix (probably
      do a toLowerCase() for language tags).
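
The toLowerCase() idea could be sketched as a small pre-processing step applied to the literal string before it reaches the parser. This is only an illustration under stated assumptions: the class LanguageTagNormalizer and the method lowerCaseLanguageTag are hypothetical names, not part of nxparser or bigdata, and where the real fix belongs is still open per the comment above. Lowercasing is safe in principle because language tags are case-insensitive under BCP 47.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LanguageTagNormalizer {
    // Hypothetical helper pattern: a quoted lexical form followed by '@'
    // and a language tag in either case.
    private static final Pattern LANG_LITERAL =
            Pattern.compile("(\".*\")@([a-zA-Z]+(?:-[a-zA-Z0-9]+)*)");

    // Returns the literal with its language tag lowercased.
    // Anything that is not a language-tagged literal is returned unchanged.
    static String lowerCaseLanguageTag(String literal) {
        Matcher m = LANG_LITERAL.matcher(literal);
        if (!m.matches()) {
            return literal;
        }
        // Locale.ROOT avoids locale-specific case mappings (e.g. Turkish 'I').
        return m.group(1) + "@" + m.group(2).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(lowerCaseLanguageTag("\"Up to 11.9\"@FR"));
        System.out.println(lowerCaseLanguageTag("\"colour\"@en-GB"));
        System.out.println(lowerCaseLanguageTag(
                "\"42\"^^<http://www.w3.org/2001/XMLSchema#int>"));
    }
}
```

The lexical form inside the quotes is left untouched; only the tag after the final '@' is changed, and typed or plain literals pass through as-is.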
      

      I have added a unit test for bigdata which verifies the problem.

      You can work around the problem by modifying the nxparser source code as indicated above. The bug is against nxparser 1.2.2. There is also a bug report against nxparser for this issue; see http://code.google.com/p/nxparser/issues/detail?id=9

          People

          • Assignee:
            bryanthompson
            Reporter:
            bryanthompson