2011/12/05

Maven: The Aggregator vs. The Parent

I have recently realized that even professionals who have worked with Maven for an extended period do not fully grasp the difference between Maven's aggregators and parents. The reason is probably that both terms relate to multi-module projects and that both approaches often 'meet' in a single file.

Multi-module projects usually have a rather flat structure with a single top-level pom.xml file. That file then lists the sub-modules and defines the versions of dependencies and/or plugins inherited by these sub-modules. This is well known; what is less known is that these two roles of pom.xml can be separated.

Aggregator
A top-level module that joins multiple modules under one roof is called an aggregator. Its only purpose is to present more or less independently existing modules as parts of a greater whole.

Example of aggregator pom.xml:
<project xmlns="...">

<modelVersion>4.0.0</modelVersion>

<groupId>org.bithill</groupId>
<artifactId>aggregator</artifactId>
<packaging>pom</packaging>
<version>1.0</version>
<name>Project Aggregator</name>

<modules>
 <module>project1</module>
 <module>project2</module>
</modules>

</project>

Parent

As you see, the aggregator does not include any information about dependencies. The source of the to-be-inherited information about libraries and plugins is known as a parent POM. It includes the properties and the dependencyManagement and pluginManagement sections, which state the versions of the project's dependencies and plugins, plus some plugin configuration where it comes in handy. Ideally this information should be de-duplicated and inherited by plugins in sub-modules, but that does not apply to reporting plugins.

Example of parent pom.xml:
<project xmlns="...">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.bithill</groupId>
    <artifactId>parent</artifactId>
    <packaging>pom</packaging>
    <version>1.0</version>
    <name>shared parent</name>

    <properties>
      <java.jdk.version>1.6</java.jdk.version>
      <spring.version>3.0.2.RELEASE</spring.version>
    </properties>

    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>org.slf4j</groupId>
          <artifactId>slf4j-api</artifactId>
          <version>1.6.0</version>
        </dependency>

        <dependency>
          <groupId>org.springframework</groupId>
          <artifactId>spring-context</artifactId>
          <version>${spring.version}</version>
        </dependency>
        <dependency>
          <groupId>org.springframework</groupId>
          <artifactId>spring-aop</artifactId>
          <version>${spring.version}</version>
        </dependency>
      </dependencies>
    </dependencyManagement>

    <build>
      <pluginManagement>
        <plugins>
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>2.3.2</version>
            <configuration>
              <source>${java.jdk.version}</source>
              <target>${java.jdk.version}</target>
            </configuration>
          </plugin>
        </plugins>
      </pluginManagement>
    </build>
</project>

Using Aggregator and Parent POM Together

So we showed that we can have two different artificial POM files serving two different roles - aggregation and inheritance.   


Diagram of relationships in a project consisting of two sub-modules:

The last missing piece of the picture is an example of a sub-module's pom.xml. As you see, no dependency or build plugin needs to declare its version - that is inherited from the parent POM. The parent's pom.xml is deployed in a Maven repository, but Maven by default looks for the parent at the relative path "../pom.xml". To avoid the aggregator being used as the parent, the relativePath element must be set empty. This is probably the only trick here.
<project xmlns="...">

    <modelVersion>4.0.0</modelVersion>
    <groupId>org.bithill</groupId>
    <artifactId>project1</artifactId>
    <packaging>jar</packaging>
    <version>1.6-SNAPSHOT</version>
    <name>Project #1</name>

    <parent>
      <groupId>org.bithill</groupId>
      <artifactId>parent</artifactId>
      <version>1.0</version>
      <relativePath/>
    </parent>
  
    <dependencies>
      <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-context</artifactId>
      </dependency>
      <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-aop</artifactId>
      </dependency>
    </dependencies>

    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-compiler-plugin</artifactId>
        </plugin>
      </plugins>
    </build>

</project>

2011/11/20

Book Review: Apache Maven 3 Cookbook

I have decided to review a new book about Maven from Packt Publishing: Apache Maven 3 Cookbook by Srirangan, which promises "Quick answers to common problems" on its cover.

The book is divided into nine chapters:
  1. basics
    Maven installation and environment settings. Generating, compiling and testing simple project. POM structure, build lifecycle and profiles.
  2. software engineering techniques
    Modularization, dependency management, static code analysis, JUnit, Selenium.
  3. agile team collaboration
    Nexus, Hudson, version control, offline mode.
  4. reporting and documentation
    Mvn site, javadocs, test and code quality reports, dashboard.
  5. Java development
    Building and running web application (jetty), JEE, Spring, Hibernate, Seam.
  6. Google development
    Android, GWT, App Engine.
  7. Scala, Groovy and Flex
  8. IDE integration
    Eclipse, NetBeans, Intellij IDEA
  9. extending Maven
    plugin development basics
The book is certainly not material for beginners. Some terms are used without prior definition or any explanation. Experienced Maven users will already know the missing pieces or will be able to find them, but beginners must be terribly confused. I think that providing at least a description of Maven's standard directory layout for a project or a repository would be beneficial.

It is not so much about Maven as I expected, and it says nothing about what is new in Maven 3. It should be exhaustive at least in the purely Maven parts, but even there it does not offer enough information to make me happy. I would expect a description of the template languages in the part dedicated to the site plugin.

If you like puzzles and do not mind using the internet to search for missing pieces, or you just cannot recall some setting covered in the book, you will like it. It can also serve as a good starting point that shows which tools, frameworks or languages can be used from or with Maven. Do you like Maven and want to try GWT or Scala? The book helps to lower the entry barrier by providing examples of how to get a working "playground" project in no time.

Would I buy it? No. My bookcase has limited capacity, and since I need the internet anyway to fill the gaps or correct the bugs in the book (although there are not many), I will be better off with online sources only.

2011/11/06

Shortcuts in Eclipse / Intellij Idea

I decided to publish this list of keyboard shortcuts for Eclipse and IntelliJ IDEA, started long ago, to make the transition from one IDE to the other easier for my colleague. I hope it will be useful for other developers in a similar situation too. The Eclipse shortcut comes first and the IntelliJ IDEA one second; where you see only one shortcut in the list, it is identical in both IDEs.

List Shortcuts: Ctrl+Shift+L / ? (Help - Default Keymap Reference)
Search Actions: ? / Ctrl+Shift+A

Editor


Select All: Ctrl+A
Reformat: Ctrl+Shift+F / Ctrl+Alt+L
Open File By Name: Ctrl+Shift+R / Ctrl+Shift+N
Open Class By Name: Ctrl+Shift+T / Ctrl+N
Go to Matching Bracket:  Ctrl+Shift+P / Ctrl+{, Ctrl+}
Paste from Clipboard Stack: ? / Ctrl+Shift+V
Vertical Blocks: ? / Alt+Shift+Insert

Code Navigation and Manipulation

Navigate Forward in History: Alt+right arrow / Ctrl+Alt+right arrow
Navigate Backward in History: Alt+left arrow / Ctrl+Alt+left arrow

Go to Line: Ctrl+L / Ctrl+G

Delete Line: Ctrl+D / Ctrl+Y

Open Documentation: "mouse over" / Ctrl+Q
Open Declaration: F3 / Ctrl+B
Open Hierarchy: F4 / Ctrl+H
Find Implementation: ? / Ctrl+Alt+B

Find Usages: Ctrl+Alt+G / Alt+F7
Usages Pop-Up: ? / Ctrl+Alt+F7

Code Completion: Ctrl+Space
Optimize Imports: Ctrl+Shift+O / Ctrl+Alt+O
List of Methods to Override/Implement: ? / Ctrl+O
Generate...: Ctrl+Shift+G / Alt+Insert

Comment Line: Ctrl+/
Comment Block: Ctrl+Shift+/

Code Fold: Ctrl+"numpad -" / Ctrl+"-"
Code Unfold: Ctrl+"numpad +" / Ctrl+"+"

Move Code Up: Alt+Up / Ctrl+Shift+Up
Move Code Down: Alt+Down / Ctrl+Shift+Down 

Create Test: ? / Ctrl+Shift+T

Refactoring

Rename: Alt+Shift+R / Shift+F6
Extract Method: Alt+Shift+M / Ctrl+Alt+M
Introduce Variable: Ctrl+Shift+M / Ctrl+Alt+V

Search

In Current File: Ctrl+F
In All Files: Ctrl+H / Ctrl+Shift+F

Version Control

Commit Changes: Alt+C / Ctrl+K
Update: Alt+U / Ctrl+T
VCS Popup: ? / Alt+` (back quote)

Windows

Maximize: Ctrl+M / Ctrl+Shift+F12

Debugging

Debug: F11 / Shift+F9
Step Into: F5 / F7
Selective/smart Step Into: Ctrl+F5 / Shift+F7
Step Over: F6 / F8
Step Out: F7 / Shift+F8
Resume: F8 / F9

Evaluate Expression: Ctrl+Shift+I / Alt+F8

Jump To Caller: ? / Ctrl+Alt+F7
Show the Caller Hierarchy:  ? / Ctrl+Alt+H

I want to keep updating this list depending on which IDE I use. If you have anything that should be added to the list, or you know a replacement for one of the question marks in it, let me know. Thanks to all who have already contributed.

2011/10/17

Remarkable Changes in Past Versions of Selenium 2 WebDriver

For a long time I kept the code base of our tests running on Selenium 2.0a6. The reasons for not upgrading differed between Selenium 2 versions: perceived stability problems in Internet Explorer or Firefox, changes not being beneficial enough for our tests, and sometimes simply a lack of time for the change.

Now it seems Selenium 2.8 is a good candidate for an upgrade. The following list summarizes the changes we waited for, and I want to use it as a thank-you to all the Selenium 2 developers who participated in the effort. I also hope it will help anybody upgrading or considering an upgrade to a newer Selenium 2.

1/ RenderedWebElement was deprecated and then removed in 2.0rc3; its isDisplayed() method was moved to the WebElement interface.

2/ Mouseover works since 2.0 RC2:

import org.openqa.selenium.interactions.Actions;
import org.openqa.selenium.interactions.Action;
Actions builder = new Actions(driver);
Action hoverAction = builder.moveToElement(mouseOverElement).build();
hoverAction.perform();

3/ For some time it was necessary to send Enter to a button in MSIE to press it,
but since version 2.2 clicking on buttons (WebElement.click()) seems to work flawlessly.

4/ Version 2.3 brought the nice Alert interface for confirmation and alert dialogs, thus rendering JavaScript workarounds obsolete:

package org.openqa.selenium;
public interface Alert
{
  void dismiss();
  void accept();
  String getText();
  void sendKeys(String keysToSend);
}

the usage:

Alert prompt = driver.switchTo().alert();
// some short sleep here
log.debug( prompt.getText() );
prompt.sendKeys("AAA");
// some short sleep here
prompt.accept();

5/ And finally, since 2.8 important parts of the advanced interactions, double-click and right-click, work in both MSIE and FF (since 2.5 for MSIE):

Actions builder = new Actions(driver);
Action doubleClick = builder.doubleClick(element).build();
doubleClick.perform();

// a differently named builder, so both snippets can live in the same scope
Actions contextBuilder = new Actions(driver);
Action rightClick = contextBuilder.contextClick(element).build();
rightClick.perform();

2011/10/12

Java Applets - Building with Maven, Communicating with JavaScript

I have recently realized that I have not written an applet for some time and need to refresh my know-how. I hope that writing it down here will save you some time if you are in a similar situation.

The example shows communication between a Java applet and JavaScript in both directions: Java-to-JS and JS-to-Java.

Building Applet with Maven

To help Maven find the classes the applet needs for talking to JavaScript (netscape.javascript.*) and compile the applet, you need to add the Java plugin dependency:
<dependency>
  <groupId>sun.plugin</groupId>
  <artifactId>plugin</artifactId>
  <version>1.6</version>
  <scope>system</scope>
  <systemPath>${java.home}/lib/plugin.jar</systemPath>
</dependency>

You'll also want to supply your own manifest file, as the default generated one does not contain the Main-Class header:
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jar-plugin</artifactId>
      <configuration>
        <archive>
          <manifestFile>src/main/resources/META-INF/MANIFEST.MF</manifestFile>
        </archive>
      </configuration>
    </plugin>
  </plugins>
</build>

Example of the manifest:
Manifest-Version: 1.0
Main-Class: org.bithill.SimpleApplet

Deployment

One of the things that has changed significantly since I wrote my last applet is the introduction of the Deployment Toolkit - a JavaScript library for including an applet in a web page. It looks like a marvel compared to the tedious and error-prone creation of applet or object tags and checking of browser differences.

<script src="deployJava.js"></script>
<script>
  var attributes =
  {
    id: 'simpleApplet',
    codebase:'../../../target', // directory with the jar
    code:'org.bithill.SimpleApplet.class',
    archive:'SimpleApplet-1.0.jar',
    width: 100, height: 50,
    boxbgcolor: '#eeeeee'
  };
  var parameters = {};
  var version = '1.6';

  deployJava.runApplet(attributes, parameters, version);
</script> 

The Applet and Its Interaction with JavaScript
package org.bithill;

import static java.lang.System.out;
import netscape.javascript.JSException;
import netscape.javascript.JSObject;
import java.applet.Applet;
import java.util.Date;

public class SimpleApplet extends Applet
{
   private JSObject js; // object for communication with JavaScript

   /** Evaluates given JavaScript expression.
    * @param jsExpression expression
    */
   public String jsEval(String jsExpression)
   {
      try { js.eval(jsExpression); }
      catch (JSException ex) { ex.printStackTrace();  }
      return new Date() + " | " + jsExpression;
   }

    /** Initializes the applet.
     *  It's called only once before the applet is started.
     */
    @Override
    public void init()
    {
       try { js = JSObject.getWindow(this);  }
       catch (JSException ex) { ex.printStackTrace();  }
    }
}

The applet gets a reference to a JSObject for interacting with the JavaScript engine instance in the page. It is then used in the jsEval() method, which also returns the time-stamped input to demonstrate reading of the method's return value. In the page you need only several lines of JavaScript to connect things together:

<script>function getAlertExpression(msg) { return "alert('" + msg + "')"; }</script>

<button onclick="document.getElementById('msgbox').innerText = ( simpleApplet.jsEval( getAlertExpression('HI') ) )">
show alert
</button>

<div id="msgbox">-- last action --</div> 

How does it work? The button has a JavaScript onclick handler that calls the applet's jsEval() method and puts its return value into the prepared msgbox div. Notice that the applet's id is used as the reference. The jsEval() method evaluates the JavaScript expression in the page, which results in showing an alert dialog.

2011/09/18

Using Smart Card as Keystore in Java, setup

Using a smart card as a key store promises stronger security compared to storing keys or certificates on a disk. This can be further improved by using a card reader with a PIN pad, an effective counter-measure against key loggers.

This article should provide basic information on how to use a smart card as a key store for Java applications. You do not need an expensive card for such an application - a cheaper, specialized crypto-card will do. The installation instructions in this article focus on Linux, as it is my preferred platform and the setup is a bit more complicated than on Windows.

The stack

                 application
                      |
           java.security.KeyStore
                      |
                     JVM
                      |
                PKCS11 provider
                      |
              PC/SC middleware
                      |
                    CCID
                      |
            USB smart card reader
                      |
                 smart card 

Installing Software
  1. Download a driver for your smart card reader from its producer's page and install it.
  2. Download and install PC/SC middleware - PCSC-Lite. It does not require any configuration if you use a USB reader.
  3. Get a PKCS11 provider for your card. You can use the open-source one (OpenSC) or the producer's implementation, depending on which one works better with Java.

Setting PKCS11 Token for Java 

First you have to configure the PKCS11 provider for Java. Open $JAVA_HOME/jre/lib/security/java.security and look for the registered security providers - find the lines starting with the text security.provider. Add a new security provider by adding the line security.provider.9=sun.security.pkcs11.SunPKCS11 /etc/pkcs11_java.cfg (use the next free number instead of 9 if your file differs). The Sun PKCS#11 provider allows integration of PKCS11 tokens with the Java platform by interfacing with a native library, usually delivered by the token producer.

The configuration file following the provider's fully qualified name may contain various PKCS11 settings. It usually contains only the three lines we can see in this setting for OpenSC:

name = OpenSC-PKCS11
description = SunPKCS11 via OpenSC
library = /usr/lib/opensc-pkcs11.so

The name entry serves as the name of the PKCS11 provider, and description is AFAIK optional. The most important is the library property; it contains the path to the PKCS11 implementation we want to use.

Depending on the environment in which the application will be used, we may need to create a custom security policy. Note that the name of the provider is prefixed with "SunPKCS11-":

grant { 
       permission java.security.SecurityPermission
       "authProvider.SunPKCS11-OpenSC-PKCS11";
 };
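
Whether the registration in java.security took effect can be checked from Java itself. The following is only a minimal sketch: the class and method names are made up for illustration, and "OpenSC-PKCS11" is the name value from the configuration file shown earlier. It builds the prefixed provider name and asks java.security.Security whether such a provider is known to the running JVM.

```java
import java.security.Provider;
import java.security.Security;

// Hypothetical helper for illustration only.
final class Pkcs11ProviderCheck
{
  /** Builds the provider name from the "name" entry of the PKCS11 config file. */
  static String providerName(String configName)
  {
    return "SunPKCS11-" + configName;
  }

  /** Returns true if a provider with the given name is registered in this JVM. */
  static boolean isRegistered(String providerName)
  {
    return Security.getProvider(providerName) != null;
  }

  public static void main(String[] args)
  {
    String name = providerName("OpenSC-PKCS11"); // "name" value from the config file
    System.out.println(name + " registered: " + isRegistered(name));

    // List everything that is registered, to help spotting typos in java.security
    for (Provider p : Security.getProviders())
    {
      System.out.println("  " + p.getName());
    }
  }
}
```

If the provider is reported as not registered, the numbering of the security.provider lines and the path to the configuration file are the first things I would check.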


In the second part we will see how to create a key and a certificate, load them onto the card and use the key on the card to sign and verify.

2011/09/16

Basic Key/Certificates Manipulation by OpenSSL

Getting Server's SSL/TLS Certificate Chain


openssl s_client -connect some_hostname:443 -showcerts
X.509 certificates are dumped as base64-encoded strings between the -----BEGIN CERTIFICATE----- and -----END CERTIFICATE----- lines. They should be stored (together with these lines) in files with the .pem suffix.

We can then look at the certificate information:
openssl x509 -in cert.pem -inform PEM -noout -text
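
The PEM layout is simple enough to decode by hand. The following is a minimal Java sketch, not a replacement for openssl: it assumes Java 8 or newer (for java.util.Base64), and the class and method names are made up for illustration. It skips everything before the BEGIN line, collects the Base64 lines up to the END line and decodes them, which yields the DER bytes. The input in main() is not a real certificate, just a short Base64 string wrapped in certificate headers.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration only.
final class PemToDer
{
  /** Returns the decoded (DER) bytes of the first PEM block found in the text. */
  static byte[] decode(String pem)
  {
    StringBuilder base64 = new StringBuilder();
    boolean inside = false;
    for (String line : pem.split("\\r?\\n"))
    {
      String trimmed = line.trim();
      if (trimmed.startsWith("-----BEGIN ")) { inside = true; continue; }
      if (trimmed.startsWith("-----END ")) { break; }
      if (inside) { base64.append(trimmed); } // lines outside the block are ignored
    }
    return Base64.getDecoder().decode(base64.toString());
  }

  public static void main(String[] args)
  {
    // Dummy payload: "aGVsbG8=" is Base64 for the ASCII text "hello"
    String pem = "-----BEGIN CERTIFICATE-----\naGVsbG8=\n-----END CERTIFICATE-----\n";
    byte[] der = decode(pem);
    System.out.println(der.length + " bytes: " + new String(der, StandardCharsets.US_ASCII));
  }
}
```

Because lines outside the BEGIN/END pair are skipped, the raw output of s_client can be passed in directly; only the first certificate is decoded in this sketch.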

    Conversion of Key and Certificate Formats

    Keys

    • PKCS1 – PEM to DER
      openssl rsa -in key.pem -out key.der -inform pem -outform der 

      The key format is reflected in the header (of the key.pem):
      • PKCS#1 - BEGIN RSA PRIVATE KEY, BEGIN RSA PUBLIC KEY
      • PKCS#8 - BEGIN PRIVATE KEY, BEGIN ENCRYPTED PRIVATE KEY
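
The header check can also be done in code before deciding how to parse a key file. This is a minimal sketch with made-up class and method names; it only compares the first BEGIN line with the four headers listed above and reports anything else as unknown.

```java
// Hypothetical helper for illustration only.
final class PemKeyFormat
{
  /** Guesses the key encoding from the first "-----BEGIN ...-----" line of a PEM text. */
  static String detect(String pem)
  {
    for (String line : pem.split("\\r?\\n"))
    {
      String header = line.trim();
      if (!header.startsWith("-----BEGIN ")) { continue; }
      if (header.equals("-----BEGIN RSA PRIVATE KEY-----")
          || header.equals("-----BEGIN RSA PUBLIC KEY-----")) { return "PKCS#1"; }
      if (header.equals("-----BEGIN PRIVATE KEY-----")
          || header.equals("-----BEGIN ENCRYPTED PRIVATE KEY-----")) { return "PKCS#8"; }
      return "unknown"; // some other PEM block, e.g. a certificate
    }
    return "unknown"; // no BEGIN line found at all
  }

  public static void main(String[] args)
  {
    System.out.println(detect("-----BEGIN RSA PRIVATE KEY-----\n...\n-----END RSA PRIVATE KEY-----"));
    System.out.println(detect("-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----"));
  }
}
```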

    Certificates

    • PEM to P12
      openssl pkcs12 -export -out cert.p12 -in cert.pem -inkey key.pem
    • PEM to DER
      openssl x509 -in cert.pem -inform PEM -out cert.der -outform DER

    2011/09/13

    Setting Firefox Preferences via Selenium 2 (WebDriver API)

    I wanted to run Firefox from WebDriver with custom preferences. So I looked into the well-known about:config for the name of the option and, just to be sure, consulted the About:config entries section of Mozilla Wiki.

    With the name of the config option to change, the rest is a piece of cake:

    FirefoxProfile profile = new FirefoxProfile();
    profile.setPreference("dom.event.contextmenu.enabled",false);
    WebDriver webDriver =  new FirefoxDriver(profile);
    

    2011/08/02

    Fast Inserts to PostgreSQL with JDBC and COPY FROM

    I was reading some materials on how to make database inserts from Java as efficient as possible. It was motivated by an existing application that stores measurements in PostgreSQL. So I decided to compare the known approaches to see if there is a way to improve the application, which already uses batched inserts.

    For the purpose of the test I created the following table:

    CREATE TABLE measurement
    (
      measurement_id bigint NOT NULL,
      valid_ts timestamp with time zone NOT NULL,
      measurement_value numeric(19,4) NOT NULL,
      CONSTRAINT pk_mv_raw PRIMARY KEY (measurement_id, valid_ts)
    )
    WITH (OIDS=FALSE)
    

    I decided to test the insertion of 1000 records into the table. The data for the records was generated before running any of the test methods. Four test methods were created to reflect the usual approaches:
    • VSI (Very Stupid Inserts) - executing queries made of concatenated Strings one by one
    • SPI  (Stupid Prepared Inserts) - similar to VSI but using prepared statements
    • BPI (Batched Prepared Inserts) - prepared inserts, executed in batches of various length
    • CPI (Copy Inserts) - inserts based on COPY FROM, executed in batches of various length
    Before each test the table is cleared, and again after all data are successfully inserted. Commit is called only once in each test method, after all the insert calls. The following code excerpts illustrate the approaches listed above:

    VSI

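    Statement insert = conn.createStatement();
    for (int i=0; i<testSize; i++)
    {
      // the SQL text is rebuilt and parsed again for every single row
      String insertSQL = "insert into measurement values ("
                + measurementIds[i] +",'"+ timestamps[i] +"',"+values[i]+")";
      insert.execute(insertSQL);
    }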
    

    SPI
    PreparedStatement insert = conn.prepareStatement("insert into measurement values (?,?,?)");
    for (int i=0; i<testSize; i++)
    {
      insert.setLong(1,measurementIds[i]);
      insert.setTimestamp(2, timestamps[i]);
      insert.setBigDecimal(3, values[i]);
      insert.execute();
    }
    

    BPI

    PreparedStatement insert = conn.prepareStatement("insert into measurement values (?,?,?)");
    
    for (int i=0; i<testSize; i++)
    {
      insert.setLong(1,measurementIds[i]);
      insert.setTimestamp(2, timestamps[i]);
      insert.setBigDecimal(3, values[i]);
      insert.addBatch();
      // flush after every batchSize rows (i % batchSize would flush a batch of one row at i == 0)
      if ((i + 1) % batchSize == 0) { insert.executeBatch(); }
    }
    insert.executeBatch();
    

    CPI

    StringBuilder sb = new StringBuilder();
    CopyManager cpManager = ((PGConnection)conn).getCopyAPI();
    PushbackReader reader = new PushbackReader( new StringReader(""), 10000 );
    for (int i=0; i<testSize; i++)
    {
        sb.append(measurementIds[i]).append(",'")
          .append(timestamps[i]).append("',")
          .append(values[i]).append("\n");
        // flush after every batchSize rows (i % batchSize would flush a single row at i == 0)
        if ((i + 1) % batchSize == 0)
        {
          reader.unread( sb.toString().toCharArray() );
          cpManager.copyIn("COPY measurement FROM STDIN WITH CSV", reader );
          sb.delete(0,sb.length());
        }
    }
    reader.unread( sb.toString().toCharArray() );
    cpManager.copyIn("COPY measurement FROM STDIN WITH CSV", reader );
    

    I hoped to get some improvement from using COPY FROM instead of batched inserts, but did not expect a big gain. The results were a pleasant surprise: for a batch of size 50 (as defined in the original application I wanted to improve), COPY FROM gave a 40% improvement. I expect further improvement when the data come from a stream and the StringBuilder-with-PushbackReader exercise can be skipped.
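
The batch boundary handling shared by the BPI and CPI variants can be looked at in isolation from JDBC. The following is a minimal sketch, not the benchmark code: BatchSink is a hypothetical stand-in for the call that ships one batch to the database (executeBatch() or copyIn()), and the class and method names are made up. It flushes after every batchSize rows and once more for an incomplete last batch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class CsvBatcher
{
  /** Hypothetical stand-in for the call that ships one batch to the database. */
  interface BatchSink { void send(String csvChunk); }

  /** Sends rows in chunks of at most batchSize lines; returns the number of chunks sent. */
  static int sendInBatches(List<String> csvRows, int batchSize, BatchSink sink)
  {
    StringBuilder sb = new StringBuilder();
    int rowsInBuffer = 0;
    int chunks = 0;
    for (String row : csvRows)
    {
      sb.append(row).append('\n');
      rowsInBuffer++;
      if (rowsInBuffer == batchSize) // a full batch is ready
      {
        sink.send(sb.toString());
        sb.setLength(0);
        rowsInBuffer = 0;
        chunks++;
      }
    }
    if (rowsInBuffer > 0) // ship the incomplete last batch, if any
    {
      sink.send(sb.toString());
      chunks++;
    }
    return chunks;
  }

  public static void main(String[] args)
  {
    List<String> rows = new ArrayList<String>();
    for (int i = 0; i < 1000; i++) { rows.add(i + ",2011-08-02 12:00:00+00," + (i * 0.5)); }
    // A real sink could wrap the chunk in a StringReader and hand it to copyIn()
    int chunks = sendInBatches(rows, 50, chunk -> { });
    System.out.println(chunks + " chunks sent");
  }
}
```

Counting the rows in the buffer, rather than testing the loop index, keeps the first batch the same size as the others and avoids sending an empty final batch.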

    See the graphs yourself - the number following the method abbreviation is the size of the batch.

    Average time in milliseconds
    All the 200 runs individually

    source code for the "benchmark"

    2011/07/21

    Elasticsearch in 10 Minutes

    We produce a lot of PDF files with documentation. Way too many. The problem is that nobody knows which document contains this or that piece of documentation, or where it really is.

    Some fulltext search would help. I recently stumbled upon elasticsearch - a schema-free, scalable search engine based on Apache Lucene. I decided to give it a try - not because of its distributed nature, but for its REST interface.

    I took the following 4 steps to get a simple fulltext search working:

    1/ Extracted text from the PDFs using pdftotext and a simple bash one-liner.
    for FILE in *.pdf; do pdftotext "$FILE"; done

    2/ Created a Java Maven project. The elasticsearch pom.xml I found did not contain the necessary dependencies, so I had to add them to my pom.xml, and the result is a bit messy.
    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
                                 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
    
        <groupId>docsearch</groupId>
        <artifactId>docsearch</artifactId>
        <version>1.0</version>
    
        <repositories>
         <repository>
           <id>fuse</id>
           <url>http://repo.fusesource.com/maven2/</url>
         </repository>
        </repositories>
    
        <dependencies>
          <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>0.16.0</version>
          </dependency>
    
          <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-core</artifactId>
            <version>3.3.0</version>
          </dependency>           
    
          <dependency>                  
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-analyzers</artifactId>   
            <version>3.3.0</version>                            
          </dependency>                                               
    
          <dependency>                                                      
            <groupId>org.apache.lucene</groupId>                                    
            <artifactId>lucene-snowball</artifactId>                                          
            <version>3.0.3</version>                                                                    
          </dependency>                                                                                         
                                                                                                                
          <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-fast-vector-highlighter</artifactId>
            <version>3.0.3</version>
          </dependency>
    
          <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-highlighter</artifactId>
            <version>2.4.0</version>
          </dependency>
    
          <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-queries</artifactId>
            <version>2.4.0</version>
          </dependency>
    
        </dependencies>
    
    </project> 
     
    3/ Downloaded the elasticsearch release and started it.

    4/ Wrote simple Java code to iterate over the files, read them line by line and feed them to the running elasticsearch service:

    import org.elasticsearch.action.index.IndexResponse;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.client.transport.TransportClient;
    import org.elasticsearch.common.io.Files;
    import org.elasticsearch.common.transport.InetSocketTransportAddress;
    import org.elasticsearch.node.Node;
    
    import java.io.*;
    
    import static java.lang.System.out;
    import static org.elasticsearch.common.xcontent.XContentFactory.*;
    import static org.elasticsearch.node.NodeBuilder.*;
    
    public class Main
    {
      final static String dataDirName = "/tmp/doc";
    
      public static void main (String[] args)
      {
         File dataDir = new File(dataDirName);
    
         if ( dataDir.exists() && dataDir.isDirectory() )
         {
            File[] files = dataDir.listFiles
            (
               new FilenameFilter()
               {  
                  public boolean accept(File dir, String name)
                  { return name.endsWith("txt"); }   
               }
            );
      
            // esearch client creation
            Node node = nodeBuilder().node();
            Client client = new TransportClient()
                           .addTransportAddress(new InetSocketTransportAddress("localhost", 9300));
      
            String indexName = "docs";
            String docType = "doc";
            String docId = null;
            for (File file : files)
            {
               try
               {
                  BufferedReader reader = new BufferedReader ( new FileReader(file) );
                  StringBuilder fileContent = new StringBuilder();
                  try
                  {
                     String line;
                     // keep a separator so that words on adjacent lines do not merge
                     while ( (line = reader.readLine()) != null)
                     { fileContent.append(line).append('\n'); }
                  }
                  finally { reader.close(); }
    
                  docId = file.getName();
                  IndexResponse response =
                    client.prepareIndex(indexName,docType,docId).setSource
                      ( jsonBuilder() .startObject().field("content", fileContent.toString()).endObject() )
                    .execute().actionGet();
               }
               catch (FileNotFoundException ex) { ex.printStackTrace(); }
               catch (IOException ex) { ex.printStackTrace(); }
            }
    
            client.close();
            node.close();
         }
      }
    }
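
The file-reading half of step 4 can be tried without a running elasticsearch service. The following is a minimal sketch under a few assumptions: Java 7 or newer (java.nio.file), text files encoded in UTF-8, and made-up class and method names. It collects the content of every *.txt file in a directory, keyed by file name, which is the same id/content pair the indexing loop needs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only.
final class TextFileLoader
{
  /** Maps file name to file content for every *.txt file directly inside dir. */
  static Map<String, String> load(Path dir) throws IOException
  {
    Map<String, String> docs = new TreeMap<String, String>();
    // the stream is closed automatically at the end of the try block
    try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.txt"))
    {
      for (Path file : files)
      {
        byte[] bytes = Files.readAllBytes(file); // line breaks are preserved as they are
        docs.put(file.getFileName().toString(), new String(bytes, StandardCharsets.UTF_8));
      }
    }
    return docs;
  }

  public static void main(String[] args) throws IOException
  {
    // "/tmp/doc" is the data directory used in the code above
    Path dir = Paths.get(args.length > 0 ? args[0] : "/tmp/doc");
    for (Map.Entry<String, String> doc : load(dir).entrySet())
    {
      System.out.println(doc.getKey() + ": " + doc.getValue().length() + " chars");
    }
  }
}
```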