Kibana + ElasticSearch + Logstash + Redis on RHEL 6

Overview

The stack has four components:

  • ElasticSearch: Log search engine
  • Redis: Queuing system broker
  • Logstash: Log shipper and indexer
  • Kibana: UI

Manual configuration steps

ElasticSearch

Warning
The ElasticSearch version must match the version of ElasticSearch embedded in logstash! In this case, we have to install ElasticSearch 0.20.2 because we’re using logstash 1.1.9. See http://logstash.net/docs/1.1.9/outputs/elasticsearch

Install ElasticSearch

Download

wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.20.2.tar.gz
tar xvf elasticsearch-0.20.2.tar.gz

mv elasticsearch-0.20.2 elasticsearch

elasticsearch.yml

cluster.name: elasticsearch-kibana
node.name: r6x64o11-pv084
path.conf: /usr/local/elasticsearch/config
path.data: /mnt/storage/es-data
path.work: /usr/local/elasticsearch/tmp
path.logs: /usr/local/elasticsearch/logs
bootstrap.mlockall: true

Configure Java Service Wrapper

Get the service wrapper

wget http://github.com/elasticsearch/elasticsearch-servicewrapper/archive/master.zip
unzip master
mv elasticsearch-servicewrapper-master/service/ .
rm -rf master
rm -rf elasticsearch-servicewrapper-master/

Configure elasticsearch.conf

set.default.ES_HOME=/usr/local/elasticsearch
set.default.ES_HEAP_SIZE=4096
wrapper.java.additional.10=-Des.max-open-files=true

wrapper.logfile.maxsize=5m
wrapper.logfile.maxfiles=5

Add ES home to root user’s .bash_profile

# ElasticSearch
export ES_HOME=/usr/local/elasticsearch

Create elasticsearch user

useradd -d /home/elasticsearch -s /bin/sh elasticsearch
chown -R elasticsearch:elasticsearch $ES_HOME
chown -R elasticsearch:elasticsearch /mnt/storage/es-data

Edit elasticsearch user’s .bash_profile

# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/bin

export PATH

# JAVA_HOME needs to be the latest 1.7 JDK on the system
JAVA_HOME=/usr/local/jdk7
export JAVA_HOME

#Add JAVA_HOME to the PATH
PATH=$JAVA_HOME/bin:$PATH

# ElasticSearch
export ES_HOME=/usr/local/elasticsearch

unset USERNAME

/etc/security/limits.conf (optional as this will be set in the service script, too)

elasticsearch    soft    nofile          65535
elasticsearch    hard    nofile          65535

Verify the file descriptor limit

sudo -u elasticsearch -s ulimit -Sn

Disable firewall

# /etc/init.d/iptables save
# /etc/init.d/iptables stop
# chkconfig iptables off

Install the service

bin/service/elasticsearch install

/etc/init.d/elasticsearch

# Java
JAVA_HOME=/usr/local/jdk7
export JAVA_HOME
PATH=$JAVA_HOME/bin:$PATH
RUN_AS_USER=elasticsearch
ULIMIT_N=65535

Run the service

bin/service/elasticsearch start
or
service elasticsearch start

ElasticSearch Head

Install the plugin, then open it in a browser:

bin/plugin -install mobz/elasticsearch-head

http://kibana.pd.local:9200/_plugin/head/

Redis

Install

wget http://redis.googlecode.com/files/redis-2.6.12.tar.gz
tar xzf redis-2.6.12.tar.gz

mv redis-2.6.12 /usr/local/redis
cd /usr/local/redis
make

Configure Redis – cp redis.conf 6379.conf

daemonize yes
pidfile /var/run/redis/redis_6379.pid
port 6379
timeout 300
tcp-keepalive 60
logfile /var/log/redis/redis_6379.log

Add REDIS home to root user’s .bash_profile

# Redis
export REDIS_HOME=/usr/local/redis

Create redis user

useradd -d /home/redis -s /bin/sh redis
chown -R redis:redis $REDIS_HOME
chmod 700 $REDIS_HOME

Copy Redis init script

cp utils/redis_init_script /etc/init.d/redis_6379

Configure Redis init script

# chkconfig:   - 85 15
# description: Redis is a persistent key-value database
# processname: redis

REDISUSER="redis"
REDISPORT=6379
EXEC=/usr/local/redis/src/redis-server
CLIEXEC=/usr/local/redis/src/redis-cli
PIDFILE=/var/run/redis/redis_6379.pid
CONF="/usr/local/redis/6379.conf"

In the start section of the script, change the launch line so that Redis runs as the redis user:

$EXEC $CONF ==change to==> /bin/su - $REDISUSER -c "$EXEC $CONF"

Activate Redis service

mkdir /var/run/redis /var/log/redis
chown redis:adm /var/run/redis /var/log/redis
sudo chmod 750 /var/log/redis
cd /etc/init.d
chkconfig --add redis_6379

Start

service redis_6379 start

Logstash

Download logstash on kibana.pd.local and the log producer

mkdir /usr/local/logstash
cd /usr/local/logstash
wget https://logstash.objects.dreamhost.com/release/logstash-1.1.9-monolithic.jar

Indexer configuration – vi indexer.conf

input {
  redis {
    host => "kibana.pd.local"
    type => "redis-input"
    data_type => "list"
    key => "logstash"
    format => "json_event"
  }
}

filter{
  multiline {
    type => "bb-services"
    pattern => "^20(.)*"
    negate => true
    what => "previous"
  }

  multiline {
    type => "tomcat-std"
    pattern => "^(.*\|){3}\s((?!Caused by)[^\s]).*"
    negate => true
    what => "previous"
  }

  multiline {
    type => "catalina"
    pattern => "^(SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST)(.)*"
    negate => true
    what => "previous"
  }
}

output {
  stdout { debug => true debug_format => "json"}

  elasticsearch {
    host => "localhost"
    port => "9300"
    cluster => "elasticsearch-kibana"
  }
}

Shipper configuration – vi shipper.conf

input {
  file {
    type => "bb-services"
    path => "/usr/local/blackboard/logs/bb-services-log.txt"
  }

  file {
    type => "apache-access"
    path => "/usr/local/blackboard/logs/tomcat/bb-access-log*"
  }

  file {
    type => "tomcat-std"
    path => "/usr/local/blackboard/logs/tomcat/stdout-stderr*"
  }

  file {
    type => "catalina"
    path => "/usr/local/blackboard/logs/tomcat/catalina-log.txt"
  }
}

output {
  stdout { debug => true debug_format => "json"}

  redis {
    host => "kibana.pd.local"
    data_type => "list"
    key => "logstash"
  }
}

Fire up shipper and indexer (TODO: run as service)

java -jar logstash-1.1.9-monolithic.jar agent -f indexer.conf &

java -jar logstash-1.1.9-monolithic.jar agent -f shipper.conf &

Note: Used http://java-regex-tester.appspot.com/ for regex testing.
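The multiline filters above use negate => true with what => "previous", so any line that does not match the pattern is folded into the previous event. A minimal Python sketch of that behavior, using the catalina pattern from indexer.conf, can help when checking a regex offline. The merge_multiline helper and the sample log lines are hypothetical, and Python's re module stands in for the Java regex engine, so treat it as an approximation rather than logstash's implementation.

```python
import re

# Pattern copied from the "catalina" multiline filter in indexer.conf.
CATALINA_START = re.compile(r"^(SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST)(.)*")


def merge_multiline(lines, pattern):
    """Approximate negate => true, what => "previous".

    A line that matches the pattern starts a new event; a line that does
    not match is appended to the previous event.
    """
    events = []
    for line in lines:
        if pattern.match(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events


# Hypothetical sample input: a stack trace followed by a normal line.
sample = [
    "SEVERE: Servlet.service() threw exception",
    "java.lang.NullPointerException",
    "\tat com.example.Foo.bar(Foo.java:42)",
    "INFO: Server startup in 1234 ms",
]

events = merge_multiline(sample, CATALINA_START)
for event in events:
    print(repr(event))
```

With this sample, the stack trace lines end up attached to the SEVERE event and the INFO line becomes its own event.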

Kibana

http://kibana.pd.local

Setup Ruby

rpm -Uvh http://rbel.frameos.org/rbel6
yum install ruby ruby-devel ruby-ri ruby-rdoc
cd /usr/local
wget http://production.cf.rubygems.org/rubygems/rubygems-2.0.3.zip
unzip rubygems-2.0.3.zip
(cd rubygems-2.0.3 && ruby setup.rb)

Get Kibana

wget https://github.com/rashidkpc/Kibana/archive/v0.2.0.zip
unzip v0.2.0
cd Kibana*
gem install bundler
bundle install

Configure Kibana

Elasticsearch = "kibana.pd.local:9200"
KibanaPort = 80
KibanaHost = 'kibana.pd.local'

Run Kibana

bundle exec ruby kibana.rb

Chef

Warning
Not complete

Setup Chef

  1. Sign up for hosted Chef: http://www.opscode.com/hosted-chef/
  2. Setup workstation: https://learnchef.opscode.com/quickstart/

I had to upgrade ruby-build to install the target Ruby version in the Chef doc.

brew upgrade ruby-build

Chef Cookbook

We’ll probably fork this:
https://github.com/lusis/chef-logstash

 
Posted on 04/05/2013 in analytics, tools

Synchronizing database data to external systems without triggers

I previously recommended staying away from database triggers as a best practice (see Database Trigger Best Practices). However, what do you do when you have to track changes that cannot be monitored reliably on the application tier? How can you accomplish a task such as synchronizing database data to external systems without triggers?

First of all, do not use datetime columns as a sync mechanism. They can easily fail when the system time is adjusted (e.g. NTP, daylight saving time, manual change). Certain isolation levels, such as read committed, can also lead to missed changes due to race conditions. Another problematic case is restoring or importing data. If you rely on the system time, you have no idea whether that data has been properly synchronized in the past, especially for new features.
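To make the clock-adjustment failure concrete, here is a small Python simulation. The changed_since helper, the row values, and the timestamps are all made up for illustration. It shows a sync that filters on "modified after the last sync time" silently skipping a row that was updated after the clock was set back.

```python
def changed_since(rows, last_sync):
    """Return primary keys of rows whose modified time is after last_sync."""
    return [pk for pk, modified_at in rows if modified_at > last_sync]


# The last sync ran when the system clock read 1000.
last_sync = 1000

# The clock is then stepped back, and row 2 is updated afterwards.
# Its modified time (950) is now earlier than the last sync time.
rows = [
    (1, 900),  # unchanged since before the sync
    (2, 950),  # updated AFTER the sync, but stamped with the adjusted clock
]

missed = changed_since(rows, last_sync)
print(missed)
```

The update to row 2 is never picked up, which is why a monotonically increasing version value is a safer high-water mark than a timestamp.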

For Oracle, use the system change number (SCN) that’s available by default. The SCN is exposed for each row through a pseudocolumn called “ora_rowscn”. A pseudocolumn behaves like a table column, but is not actually stored in the table.
Note: By default, Oracle tracks SCNs per committed transaction for the block in which the row resides, not per row, so unchanged rows in the same block can also show a newer SCN. Row-level tracking requires creating the table with ROWDEPENDENCIES.

select pk1,ora_rowscn from users where ora_rowscn > ?;

For SQL Server, use rowversion. rowversion is a data type that exposes automatically generated, unique binary numbers (its synonym, the timestamp data type, is deprecated). It is just an incrementing number and does not preserve a date or a time. Configuring tables for Change Tracking is another way, but rowversion is less stealthy: it is a visible column, so it’s much more likely to be reused for other purposes.

alter table system_registry add rv rowversion;
select pk1, registry_key, registry_value, CONVERT(bigint,rv) as rv
from system_registry
where rv > ?;

For PostgreSQL (coming soon), xmin is the system column that tracks row versions: it holds the ID of the transaction that inserted that version of the row. Like Oracle’s SCN, this value is tracked by default.

SELECT pk1, registry_key, registry_value, xmin FROM system_registry
where xmin::text::bigint > ?;

Use a background task to detect changes and generate the appropriate events, queued for processing in a DB table or in ActiveMQ. Create triggers for delete operations only, because deleted rows no longer exist and cannot be detected by this method. If the table is very small and the sync operation runs infrequently, simply sending the entire table contents to the external system might be an option.
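The polling approach can be sketched as follows. This is a minimal illustration, not a production design: it uses Python's built-in sqlite3 module with an in-memory database, and because SQLite has no equivalent of rowversion or ora_rowscn, the rv column here is a plain integer that the demo sets by hand. The poll_changes function and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table system_registry ("
    "pk1 integer primary key, registry_key text, registry_value text, rv integer)"
)
# rv stands in for rowversion / ora_rowscn; a real database maintains it itself.
conn.executemany(
    "insert into system_registry values (?, ?, ?, ?)",
    [(1, "a", "1", 10), (2, "b", "2", 11)],
)


def poll_changes(conn, last_seen):
    """Return rows changed since last_seen and the new high-water mark."""
    rows = conn.execute(
        "select pk1, registry_key, registry_value, rv "
        "from system_registry where rv > ? order by rv",
        (last_seen,),
    ).fetchall()
    new_mark = rows[-1][3] if rows else last_seen
    return rows, new_mark


# First poll picks up everything.
first, mark = poll_changes(conn, 0)

# Simulate an update; a real rowversion column would advance automatically.
conn.execute(
    "update system_registry set registry_value = '3', rv = 12 where pk1 = 1"
)

# Second poll returns only the changed row.
second, mark = poll_changes(conn, mark)
print(first, second, mark)
```

The background task only needs to persist the high-water mark between runs. Deletes never show up in the result set, which is why they still need a trigger or a separate mechanism.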

 
Posted on 03/29/2013 in database

Ehcache write-behind mode configurations

Each attribute below lists its description, default value, suggested value, whether we should make it configurable, and any comment.

writeBehindMaxQueueSize

  • Description: The maximum number of elements allowed per queue, or per bucket (if the queue has multiple buckets). When an attempt to add an element is made, the queue size (or bucket size) is checked, and if full then the operation is blocked until the size drops by one. Note that elements or a batch currently being processed (and coalesced elements) are not included in the size value. Programmatically, this attribute can be set with net.sf.ehcache.config.CacheWriterConfiguration.setWriteBehindMaxQueueSize().
    Use the method net.sf.ehcache.statistics.LiveCacheStatistics#getWriterQueueLength() to monitor the queue size. Add this data in the cache monitoring page. This method returns the number of elements on the local queue (in all local buckets) that are waiting to be processed, or -1 if no write-behind queue exists. Note that elements or a batch currently being processed (and coalesced elements) are not included in the returned value.
    Once all retry attempts have been executed, on exception the element (or all elements of that batch) will be passed to the net.sf.ehcache.writer.CacheWriter#throwAway method. We should use this to track how many throw away operations occur in our stats page.
  • Default: 0 (unlimited)
  • Suggested: 0
  • Make it configurable? y

writeBehindConcurrency

  • Description: The number of thread-bucket pairs on the node for the given cache. Each thread uses the settings configured for write-behind. For example, if rateLimitPerSecond is set to 100, each thread-bucket pair will perform up to 100 operations per second. In this case, setting writeBehindConcurrency="4" means that up to 400 operations per second will occur on the node for the given cache. Programmatically, this attribute can be set with net.sf.ehcache.config.CacheWriterConfiguration.setWriteBehindConcurrency().
  • Default: 1
  • Suggested: 1
  • Make it configurable? y

maxWriteDelaySeconds

  • Description: The maximum number of seconds to wait before writing behind. If set to a value greater than 0, it permits operations to build up in the queue to enable effective coalescing and batching optimisations.
  • Default: 0
  • Suggested: 1
  • Make it configurable? y

rateLimitPerSecond

  • Description: The maximum number of store operations to allow per second.
  • Default: 0
  • Suggested: 0
  • Make it configurable? y

writeCoalescing

  • Description: Whether to use write coalescing. When set to true, if multiple operations on the same key are present in the write-behind queue, then only the latest write is done (the others are redundant). This can dramatically reduce load on the underlying resource.
  • Default: false
  • Suggested: true
  • Make it configurable? n

writeBatching

  • Description: Whether to batch write operations. If set to true, storeAll and deleteAll will be called rather than store and delete being called for each key. Resources such as databases can perform more efficiently if updates are batched to reduce load.
  • Default: false
  • Suggested: true
  • Make it configurable? n
  • Comment: Make sure storeAll and deleteAll implement Redis pipelining.

writeBatchSize

  • Description: The number of operations to include in each batch. If there are fewer entries in the write-behind queue than the batch size, the queue length is used. Note that batching is split across operations. For example, if the batch size is 10 and there were 5 puts and 5 deletes, the CacheWriter is invoked. It does not wait for 10 puts or 10 deletes.
  • Default: 1
  • Suggested: 100
  • Make it configurable? y
  • Comment: If I read this right, this value is interpreted as “less than or equal to”. The batch will execute with fewer than the specified number of operations once maxWriteDelaySeconds is up.

retryAttempts

  • Description: The number of times to attempt writing from the queue.
  • Default: 1
  • Suggested: 2
  • Make it configurable? n

retryAttemptDelaySeconds

  • Description: The number of seconds to wait before retrying.
  • Default: 1
  • Suggested: 1
  • Make it configurable? n
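To illustrate how coalescing, batching, and the concurrency and rate limit settings interact, here is a small Python sketch. It is a simplified model under stated assumptions, not Ehcache's implementation: the coalesce, batches, and max_ops_per_second helpers and the sample keys are hypothetical, and the ordering of coalesced entries is a choice made for this demo.

```python
def coalesce(queue):
    """Keep only the latest write per key (the earlier ones are redundant)."""
    latest = {}
    for key, value in queue:
        # Re-insert so the key moves to the position of its latest write.
        latest.pop(key, None)
        latest[key] = value
    return list(latest.items())


def batches(ops, batch_size):
    """Split operations into batches of at most batch_size."""
    return [ops[i:i + batch_size] for i in range(0, len(ops), batch_size)]


def max_ops_per_second(concurrency, rate_limit_per_second):
    """Each thread-bucket pair applies the rate limit independently."""
    return concurrency * rate_limit_per_second


# Hypothetical queue contents: five writes across three keys.
queue = [
    ("user:1", "a"),
    ("user:2", "b"),
    ("user:1", "c"),
    ("user:3", "d"),
    ("user:2", "e"),
]

coalesced = coalesce(queue)
grouped = batches(coalesced, 2)
node_limit = max_ops_per_second(4, 100)
print(coalesced, grouped, node_limit)
```

In this model the five queued writes collapse to three, which are then sent in two batches, and four thread-bucket pairs at 100 operations per second give the 400 operations per second mentioned in the writeBehindConcurrency description.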

Reference: http://ehcache.org/documentation/apis/write-through-caching#potential-benefits-of-write-behind

 
Posted on 03/17/2013 in java, performance

Database Trigger Best Practices

Introduction

A database trigger is procedural code that executes automatically in response to certain events on a database table or view. It is frequently used to enforce relational data integrity. It is a powerful feature that must be used very carefully.

General Concerns

Maintenance and Operation

It’s easy to be aware of table relationships, constraints, indexes, and stored procedures, but triggers are stealthy. They are often forgotten or ignored by developers until an issue arises because they execute invisibly to the application tier. Their “automagic” nature also makes them harder to debug in both development and production environments, even for DBAs. Following the logic can become extremely complex because triggers can fire before or after the initiating operation and can set off a chain reaction of multiple triggers. Another risk in the field is that anyone with sufficient privileges can accidentally disable or drop them.

Performance

Row-level triggers hurt the performance of bulk operations because they always fire once per row. For example, even if you bulk insert 100 rows at once, your row-level triggers will fire 100 times. You also add unnecessary work to the database system if the trigger fires under conditions that end up producing no change.

Best Practices

  • Triggers should be used only when you cannot do something any other way. Stored procedures, constraints, computed columns, and views should be considered first.
  • Do not put business logic inside triggers.
  • Queries inside triggers must be optimized with care. Make sure there’s no missing index and the execution plan is efficient.
  • Do not perform any operation that cannot be rolled back (e.g. send emails). Consider what happens if your trigger fires but the transaction rolls back.
  • Triggers run inside the transaction of the statement that fired them. Stay away from Oracle’s autonomous transactions, which allow COMMITs and ROLLBACKs to be issued inside triggers.
  • When a trigger is used to enforce entity integrity, think about multiuser conditions with concurrency.
  • When a trigger is inevitable, look for a trigger loop and plan carefully. For example, suppose you need to add a trigger on table B.col1 that updates table C.col1 when B.col1 is updated. Check whether there is already a trigger on C.col1, and whether any existing trigger would automagically update B.col1.
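The trigger loop check in the last bullet amounts to cycle detection in a graph of "a trigger on this column updates that column" relationships. A minimal Python sketch follows, assuming you have already collected those relationships into a dictionary by hand or from the data dictionary. The find_trigger_loop function and the column names are illustrative only.

```python
def find_trigger_loop(triggers, start):
    """Return a path from start back to start, or None if there is no loop.

    triggers maps a column to the list of columns its triggers update.
    """
    def walk(node, path):
        for target in triggers.get(node, []):
            if target == start:
                return path + [target]
            if target not in path:
                found = walk(target, path + [target])
                if found:
                    return found
        return None

    return walk(start, [start])


# Existing trigger: updating C.col1 updates B.col1.
# Proposed trigger: updating B.col1 updates C.col1.
triggers = {
    "B.col1": ["C.col1"],
    "C.col1": ["B.col1"],
}

loop = find_trigger_loop(triggers, "B.col1")
no_loop = find_trigger_loop({"B.col1": ["C.col1"]}, "B.col1")
print(loop, no_loop)
```

A non-empty result means the proposed trigger would re-fire itself through the chain and needs to be redesigned before it is added.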

Process for Adding a New Trigger

  • All triggers should be reviewed by one of the DBAs and by the Performance team.
  • A static code analysis rule detects newly added database triggers and all DBAs are prompted to review them.

Posted on 03/10/2013 in database, patterns

Responsive and Adaptive Design Research – Test Automation

Selenium WebDriver

The mobile WebDriver drivers can be run on real devices, in an Android emulator, or in the iOS Simulator, as appropriate. Each driver is packaged as an app that needs to be installed on the emulator or device. The app embeds a RemoteWebDriver server and a lightweight HTTP server that receive and respond to requests from WebDriver clients, i.e. from your automated tests.

https://code.google.com/p/selenium/wiki/WebDriverForMobileBrowsers

Watir WebDriver

There are three options for using watir-webdriver to test mobile sites:

  • Running tests against an embedded browser on a real device;
  • Running tests against an embedded browser on a device emulator on a desktop machine; or
  • Running tests against a desktop browser that is configured with the same resolution and user-agent credentials as a mobile browser.

http://watirwebdriver.com/mobile-devices/

MonkeyTalk

MonkeyTalk is a simple-to-use tool with incredible power. Automate real, functional interactive tests for iOS, Android, Web/HTML5, and Hybrid apps – everything from simple “smoke tests” to sophisticated data-driven test suites.

http://www.gorillalogic.com/monkeytalk/

 
Posted on 03/10/2013 in mobile, web

Responsive and Adaptive Design Research – iframes on iOS

iframe Problems in iOS Safari

iframe Alternative for External Resources

Object:
The object element has similar problems.

Cross-Origin Resource Sharing (CORS):
Browser support

  • Gecko 1.9.1 (Firefox 3.5) and above
  • WebKit (initial revision uncertain; Safari 4 and above, Google Chrome 3 and above, possibly earlier)
  • MSHTML/Trident 6.0 (Internet Explorer 10) has native support. MSHTML/Trident 4.0 & 5.0 (Internet Explorer 8 & 9) provides partial support via the XDomainRequest object

JSON-P

 
Posted on 03/10/2013 in mobile, web

Responsive and Adaptive Design Research – Mobile Device Simulators

iPhone and iPad (on Mac)

  1. Download and install Xcode from the App Store
  2. Right click on Xcode.app and select “Show Package Contents”
  3. Drill down to Contents => Developer => Platforms => iPhoneSimulator.platform => Developer => Applications => iOS Simulator.app
  4. Drag the iOS Simulator icon onto your Dock so you’ll have easy access

http://developer.apple.com/library/ios/#DOCUMENTATION/Xcode/Conceptual/ios_development_workflow/25-Using_iOS_Simulator/ios_simulator_application.html

Android (on Mac)

  1. Download and unzip the SDK at http://developer.android.com/sdk/index.html
  2. From terminal, run
    adt-bundle-mac-x86_64/sdk/tools/android
  3. Create a new Android Virtual Device via Tools -> Manage AVDs -> New (e.g. phone: Galaxy Nexus, tablet: Nexus 7)

http://developer.android.com/tools/devices/emulator.html

Note: To reach localhost on your machine from the emulator, use the IP address “10.0.2.2”. Accessing “localhost” from inside the emulator reaches the emulator itself.

Microsoft Surface (on Windows)

  1. Prepare a Windows 7 machine for the current latest SDK, 2.0
  2. Download and install the latest Surface SDK at http://www.microsoft.com/en-us/download/search.aspx?q=surface+sdk
  3. Start -> All Programs -> Microsoft Surface SDK x -> Tools -> Surface Simulator

http://msdn.microsoft.com/en-us/library/ee804952(v=surface.10).aspx

Note: Virtual machines or remote desktops do not support Surface Simulator. If you install and run Surface Simulator on a virtual machine or remote desktop, you might see a Microsoft.DirectX.Direct3D.NotAvailableException exception. You can try to work around this exception by changing the attract application to a Microsoft Windows Presentation Foundation (WPF) application. However, you might encounter additional problems.

Microsoft Windows Phone (on Windows)

  1. Prepare a Windows 8 machine for the current latest SDK, 8.0
  2. Download and install the latest Windows Phone SDK at http://www.microsoft.com/en-us/download/search.aspx?q=windows+phone+8+sdk
 

Posted on 03/10/2013 in mobile, tools

 