Issue 33 - March - Today Software Magazine


No 33 • March 2015 • www.todaysoftmag.ro • www.todaysoftmag.com

TSM

TODAY SOFTWARE MAGAZINE

Augmented reality on mobile devices
Our Java Chronicle in action case
Future-Oriented Technology Analysis (FTA)
Interview with Jonathan Shieber, senior editor at TechCrunch
Getting Started with Apache Cassandra
The importance of prototyping
Quality Assurance in Agile
High-performance Messaging Systems - Apache Kafka
Hadoop MapReduce deep diving and tuning
Appium – automated cross-platform testing for mobile devices
Developing Secure Applications in Java
Spatial Data in the SQL Server

A simple approach for Risk Management in Scrum



6 Interview with Jonathan Shieber, senior editor at TechCrunch and CrunchBase Ovidiu Măţan

7 The First IT&C Cluster in Bucharest, Different Angle, is being officially launched Ovidiu Măţan

8 Startup Weekend Cluj 2015 - Why come? Cristina Juc

10 MVP Academy proudly presents its Class of 2015 Irina Scarlat

12 Alt-tester at Mobile World Congress 2015 Simina Rendler

15 Augmented reality on mobile devices Alexandru Fediuc and Virgil Andreies

19 Hadoop MapReduce deep diving and tuning Tudor Lăpușan

22 High-performance Messaging Systems - Apache Kafka Tiberiu Nagy

25 Spatial Data in the SQL Server Diana Muntea

28 Usable Software Design Alexandru Bolboacă

31 Quality Assurance in Agile Vasile Selegean

35 Developing Secure Applications in Java Silviu Dumitrescu and Diana Bălan

38 A simple approach for Risk Management in Scrum Sebastian Botiș

42 The importance of prototyping Cătălin Timofti

44 Our Java Chronicle in action case Vasile Mihali

46 Getting Started with Apache Cassandra Sergiu Indrie

49 Appium – automated cross-platform testing for mobile devices Vasile Pop

46 Future-Oriented Technology Analysis (FTA) Ioana Armean


editorial

Ovidiu Măţan

ovidiu.matan@todaysoftmag.com Editor-in-chief Today Software Magazine

One of these past evenings, encouraged by a relaxing atmosphere, a friend of mine and I indulged in approaching the relation between the teams of programmers and those of testers from a philosophical-religious perspective. The tester-programmer opposition, which places the programmer in the position of a creator and the tester in that of a hair-splitter, a faultfinder driven by the desire to destroy what the programmer has so passionately created, can be dissolved only after each side has set its ego aside and no longer thinks of itself as angel and, respectively, demon. It was very difficult for us to rid ourselves of these roles! The solution came from Eastern philosophy, which places each of them in a convenient position: the programmer-tester antagonism resembles the principles of Yin and Yang, a system whose value is higher than that of its components.

But let us leave the world of ideas and get back to the twists and turns of the concrete, namely to this spring's events. I would like to mention two of them, which have taken place recently. The first is …even mammoths can be Agile, the only local event that gathers the local management community. TSM was once again among the organizers. Our contribution was the involvement in publishing, in printed form, all of Gogu's stories which have appeared in the magazine during the last three years. It was a novel project and we are glad, thus, to publish the second TSM book. The second event, in which TSM participated as a partner and the undersigned as the moderator of a track, is Cluj Innovation Days 2015. It has grown a lot over the past few years, shifting its focus from the political area towards product development. The theme chosen for this edition was technology transfer. We had the opportunity to see interesting presentations from the area of research, innovation, the patenting process, or the way of getting on the stock market. Another important thing to mention was the Internet of Things workshop held by Google.

Here is an overview of this issue's articles. We begin with a short interview given to our magazine by Jonathan Shieber, senior editor at TechCrunch, whom we had the opportunity to talk to at a workshop organized in Cluj. His vision on startups emphasizes the essential, the differentiating factor. We continue with a short presentation of Different Angle, the first IT&C Cluster in Bucharest. Startup Weekend is an event in which all those who wish to develop a product should participate. We are publishing in this issue some advice from those who were awarded the first prizes in the last editions. Mobile World Congress is one of the biggest IT events in Europe, where collaborators of our magazine found out the news and are sharing their impressions with us. The series of technical articles begins with Augmented reality on mobile devices, which presents the main frameworks that help us create applications dedicated to augmented reality. Cloud technologies are represented by a series of articles in this issue: Hadoop MapReduce deep diving and tuning, High-performance Messaging Systems - Apache Kafka, Spatial Data in the SQL Server, Our Java Chronicle in action case and Getting Started with Apache Cassandra. The duality with the testing area is maintained by: Appium – automated cross-platform testing for mobile devices and Quality Assurance in Agile.
We round up in an optimistic mood with an interesting article which investigates the future of technology: Future-Oriented Technology Analysis (FTA).

Ovidiu Măţan

Founder of Today Software Magazine



Editorial Staff

Editor-in-chief: Ovidiu Mățan ovidiu.matan@todaysoftmag.com
Graphic designer: Dan Hădărău dan.hadarau@todaysoftmag.com
Copyright/Proofreader: Emilia Toma emilia.toma@todaysoftmag.com
Translator: Roxana Elena roxana.elena@todaysoftmag.com
Reviewer: Tavi Bolog tavi.bolog@todaysoftmag.com
Accountant: Delia Coman delia.coman@todaysoftmag.com

Printed by Daisler Print House

Authors list

Alexandru Bolboacă - Agile Coach and Trainer, with a focus on technical practices @ Mozaic Works - alex.bolboaca@mozaicworks.com
Alexandru Fediuc - Associate IT Consultant @ msg systems România - alexandru.fediuc@msg-systems.com
Cătălin Timofti - UX Designer @ SDL - ctimofti@sdl.com
Cristina Juc - Organizer @ Startup Weekend Cluj - cristinajuc@gmail.com
Diana Bălan - Java developer @ Accesa - Diana.Balan@accesa.eu
Diana Muntea - Software Developer @ Yardi România - diana.muntea@yardi.com
Ioana Armean - Business Analyst @ Imprezzio Global - ioanaa@imprezzio.com
Sebastian Botiș - Delivery Manager @ Endava - Sebastian.Botis@endava.com
Sergiu Indrie - Software Engineer @ HP - sergiu-mircea.indrie@hp.com
Silviu Dumitrescu - Java Line Manager @ Accesa - silviu.dumitrescu@accesa.eu
Simina Rendler - Software Tester @ Altom Consulting - simina.rendler@altom.ro
Tiberiu Nagy - Senior developer @ Betfair - tiberiu.nagy@betfair.com
Tudor Lăpușan - Java & Big Data developer @ Telenav - tudor.lapusan@telenav.com
Vasile Mihali - Senior Software Engineer @ Arobs - vasile.mihali@arobs.com
Vasile Pop - Software Engineer @ Intel România - vasile.pop@intel.com
Vasile Selegean - QA Officer @ ISDC - vasile.selegean@isdc.eu
Virgil Andreies - Associate IT Consultant @ msg systems România - virgil.andreies@msg-systems.com

Made by Today Software Solutions SRL
str. Plopilor, nr. 75/77, Cluj-Napoca, Cluj, Romania
contact@todaysoftmag.com
www.todaysoftmag.com
www.facebook.com/todaysoftmag
twitter.com/todaysoftmag
ISSN 2285 – 3502
ISSN-L 2284 – 8207

Copyright Today Software Magazine
Any total or partial reproduction of these trademarks or logos, alone or integrated with other elements, without the express permission of the publisher, is prohibited and engages the responsibility of the user, as defined by the Intellectual Property Code.



business

Interview with Jonathan Shieber, senior editor at TechCrunch and CrunchBase

We had the opportunity to talk to Jonathan Shieber within a workshop organized in Cluj at the beginning of March. Those in the startup area, interested in practicing their pitch, had the opportunity to do so and receive advice directly from him. As we expected, there is no recipe for success, but it is important to focus on the essential and the differentiating factors of the product, no matter whether we talk about a presentation or the publication of an article in TechCrunch. Here is a short interview with Jon, on the current trends.

Apple Watch will be available soon. What is your perspective on its evolution, if we think about the current limitations, like one-day battery life, dependency on an iPhone, and new versions that will make it obsolete, versus classic watches?

[Jon] I'm really not the best person to opine on Apple's iWatch. It's not my forte. But the criticism against the iWatch for its disposability seems right to me. I don't see a killer app that would persuade me to pick one up, but that was the initial criticism against the tablet. Every time Apple launches a new device into the ecosystem, people question its utility, and every time it eventually becomes the default gold standard in its category. This is one of those cases where it's best to wait and see.

As a senior editor at TechCrunch, there is a lot of interest in sending news to you. What advice would you give to a startup person who wants an article to get published in TechCrunch? Any important event he/she should attend?

[Jon] I touched on this at the panel. Be concise, describe the pain point that the company's technology solves, mention the news in one sentence, talk about the potential size of the market opportunity and research who's the right reporter to be contacting. Once an entrepreneur identifies the right reporters for the news they're announcing, then they should be persistent in contacting those people. Start the process early. If there are reporters that you respect, drop them a line and let them know. If people notice us doing our job, then we're more likely to notice whatever it is that they're doing.

Is TechCrunch actively involved in a startup accelerator through CrunchBase?

[Jon] TechCrunch is not affiliated with any accelerator or incubator. There is a fund that was started by TechCrunch founder, Michael Arrington, called CrunchFund, but I'm not sure what the relationship is between that investment vehicle and TechCrunch (from an institutional perspective).

What is your fair opinion about startups from Romania?

[Jon] Talent is abundant in Romania. I met with several passionate entrepreneurs who are pursuing interesting ideas, but the ecosystem is quite young in Romania and fairly immature. There's a definite need for more capital and more operational talent with experience in business development.

Ovidiu Măţan

ovidiu.matan@todaysoftmag.com Editor-in-chief Today Software Magazine


event


The First IT&C Cluster in Bucharest, Different Angle, is being officially launched

March 26th 2015, Sky Tower, Bucharest – the official launch of the first IT&C Cluster in Bucharest, Different Angle, also coincides with the announcement of the first domain of common interest promoted by the members of the new organization: Smart Cities.

Starting from the premise that modern, intelligent cities are able to ensure a comfortable and sustainable environment for their inhabitants through the efficient usage of the available technology, the Different Angle Cluster has invited three genuine "ambassadors" of the concept of Smart Cities to the launch event. Thus, the event benefits from the presence of Prof. Dr. Sorin Cotofana from the Technical University of Delft, the Netherlands; Chrysses Nicolaides, Founder of the Mediterranean Smart Cities Cluster; and Giora Levi, CEO of Alvarion. The case studies they present focus on practical implementations of the basic principles of Smart Cities in domains such as:
• optimal ways of transferring knowledge, resources and experience between the providers of solutions and the beneficiaries of the urban environments;
• using wireless networks as a basis for the development of smart cities;
• the capacity of action of the IT&C Clusters in the European context.

Initiated by its eleven member companies – consultancy and IT&C companies with over fifteen years of experience on the market – the Different Angle Cluster aims to rethink and stimulate the typical forms of collaboration. The medium and long term goals of the Different Angle Cluster consist of: improving the efficiency of the local IT&C potential, the knowledge transfer between the academic environment and the private one, and reducing the shortage of a specialized workforce in the IT&C domain in Bucharest. The Different Angle Cluster is formed of the following companies: Econo-heat, eSolutions Grup, Evolva Trend Consultant, Gemini Solutions, GreenTree Applications, Lasper Human Development, Nemetschek România Sales&Support, Power Net Consulting, Qualitance, RezolvIT, Tremend.

Ovidiu Măţan

ovidiu.matan@todaysoftmag.com Editor-in-chief Today Software Magazine



entrepreneurship

Startup Weekend Cluj 2015 - Why come?

Startup Weekend is a global movement that brings together people with ideas, aspirations and different backgrounds, in order to help each other achieve a common goal. It helps people develop confidence, since they can see their ideas come to life almost in the blink of an eye. And it also provides them with mentorship from real-life entrepreneurs who have already succeeded and are there to help. This year, Startup Weekend Cluj will have its 4th edition, which will take place on April 24-26. We could tell you a number of reasons why it's such a great and inspiring place to start your startup, but we decided instead to ask some of the winners of the past editions to tell you themselves. In order to do that, we came up with a few questions for them. Here are some of their answers:

1. How did you come up with the idea/concept?

"The idea was born in February 2014, on a sunny afternoon on the bench in front of Cluj Cowork, where we were working at the time with ZenQ co-founder Mircea. He told me he had this idea of a place where you can get a personality profile built by your friends. You could use it for discovering your strengths, looking for jobs, generating online trust with people who don't know you. I immediately remembered another idea I had, which was about building a habit of appreciating the people around you for who they are and the great things they do for you every day. I had it in my relationships with some of the most awesome people in my life and wanted to find a way to use technology to scale it. The ideas just clicked together and so ZenQ started." Zornitsa Tomova (ZenQ) - Winner at SWCluj 2014

2. At what stage are you now with the idea/your startup?

"After SWCluj we needed to really understand how Employee Engagement works and whether the tool we envisioned at the conference really impacts and influences engagement. It seems like it does, but we needed to see if the research supports what we believed. This is what we've been working on. It took a long time, because Engagement is a relatively new field, so the research doesn't yet have clear-cut answers."



Antonia Onaca (Engagement Management) - Winner at SWCluj 2014

3. What are some of the biggest mistakes that you made in your startup so far?

"Not moving fast enough, not keeping in touch with all the incredible people we've met during events and thereby not seizing all the available opportunities." Mara Steiu (Teentrepreneur) - Winner at SWCluj 2014

4. Are you still working together with the people you met at SW Cluj?

"Considering our current team, the answer is no. Considering some of the great, inspirational mentors we've had the chance to meet at SW, the answer is yes." Mara Steiu (Teentrepreneur) - Winner at SWCluj 2014



5. If you were to come to this year's edition, what would you do differently? What advice do you have for this year's participants?

"I wouldn't do anything differently - I think I was quite lucky with how things turned out. The only advice I could give is: go to Startup Weekend not only to have fun. Go in as a winner, in your mind, as you walk in. It makes a big difference to how much fun you will have." Zornitsa Tomova (ZenQ) - Winner at SW 2014

This was just a sneak peek. You can read the full interviews on our blog. Also, keep in mind that the special "Early bird" offer is available just till the end of March. You can be the winner of this year's edition; all you need to do is grab a seat and bring an idea. Maybe you will be the one we interview next year?

6. What is your most memorable experience from SW Cluj?

"The shock of meeting and interacting with so many awesome people, in such a short time, while also working on my project!" Mara Steiu (Teentrepreneur) - Winner at SWCluj 2014

Cristina Juc

cristinajuc@gmail.com Organizer @ Startup Weekend Cluj




entrepreneurship

MVP Academy proudly presents its Class of 2015

Bucharest, March 18, 2015 – 13 promising tech startups with global potential are part of the second batch of MVP Academy, a pre-acceleration program that will take place between March 23rd and May 14th at TechHub Bucharest. The finalists will take their products to the next level and build valuable connections in the industry by attending practical workshops, mentoring sessions and other dedicated activities. The list of teams is available online on the program website. The 13 finalists that are now part of the MVP Academy Class of 2015 are building products in fields such as security, mobile commerce, ecommerce, analytics or fashion tech, and have differentiated themselves from the other applicants through the experience they already have: most of the startups in the second batch have previously founded other companies or are formed of experienced tech professionals. The selection was made taking into consideration the team fit & experience, the market size & trend, the market validation & traction, the international potential & impact, as well as the overall feasibility of the product. Bogdan Iordache (Investment Manager, 3TS Capital Partners and Board Member, How to Web & TechHub Bucharest), Cosmin Ochișor (Business Development Manager, hub:raum, who evaluated the teams on behalf of hub:raum and Telekom Romania) and Alex Negrea (Co-Founder, docTrackr) were part of the jury that selected the finalists.

Meet the MVP Academy Class of 2015

The startups that are now part of the second batch of the pre-acceleration program are:
1. Accelerole: pay-as-you-go management software that helps freelancers and agencies master the bill-by-the-hour structure & better manage their internal processes;
2. Catwalk15: mobile app that helps you get instant fashion advice and find style inspiration;
3. Clepsisoft CyberFog: proactive cybersecurity solution which can deflect cyberattacks targeting your company;
4. CloudHero: management and reporting product that empowers teams and individuals to manage their infrastructure, deploy new services and control costs from one single place;
5. Conversion Network: integrated marketing software that enables affiliate marketers to build and scale their business with less effort and better results;
6. InnerTrends: web analytics solution for web apps and ecommerce websites focused on tracking customers;
7. MyDog.xyz: platform that interconnects dog owners, dog service providers and parks;
8. SafeDrive: mobile app that improves traffic safety by rewarding users who don't use the phone while driving with points that can be converted into products and services;
9. Seeds: unified survey platform for heavy data gatherers;
10. Squady: platform that helps people discover, join and create social activities in an easy and intuitive way;
11. Swapr: easy-to-use mobile app that allows women to exchange clothes with each other anytime, anywhere;
12. SwipeTapSell: single-page browser-based app that offers a streamlined mobile and tablet experience, helping online stores to engage and convert more;
13. Unloq: a new way to authenticate and authorize transactions that replaces passwords with devices, thus offering its users increased security, simple & free.

MVP Academy is a program organized in partnership with Telekom Romania & Bitdefender, with the support of CyberGhost, Raiffeisen Bank, hub:raum and Microsoft. Between March 23rd and May 14th, the 13 teams will pass through a complex pre-acceleration process, customized to fit their specific development needs, will take their products to the next level, and will learn how to take advantage of global market opportunities. All these by attending a series of practical workshops, 1-on-1 mentoring & coaching sessions, pitching practice and other dedicated activities. During the 7 weeks of the program, the startups will build valuable connections with renowned mentors and industry leaders, among which there are Jon Bradford (Managing Director, Techstars UK), Mike Butcher (Senior Editor, TechCrunch Europe), Alex Barrera (Co-Founder, Tech.eu & Press42), Ivan Brezak Brkan (Editor, Netokracija), Olaf Lausen (Chief of CEO Staff & Business Development Director, Telekom Romania) or Florin Talpeș (CEO & Founder, Bitdefender). The list of mentors that have kindly accepted the invitation to join the program is available online at http://bit.ly/MVPmentors, and it will be periodically updated to fit the specific needs of the teams in the second batch. Moreover, the teams will discuss with representatives of some of the most important accelerator programs worldwide (Techstars, Startupbootcamp, Startup Wiseguys, Ignite 100 or LAUNCHub), with angel investors and early stage investment funds that are active in the region (Early Bird, 3TS Capital Partners). The pre-acceleration program will end on Thursday, May 14th, with Demo Day, an event where the 13 startups will pitch their products and showcase their progress in front of an audience comprising investors and leaders of the tech professional community from Europe and beyond. They will thus have the chance to start discussions for obtaining follow-up funding or closing strategic partnerships to support them in their further development.

The visibility of the event is ensured by F6S, CrunchBase, Netocratic, Inventures.eu, Entrepreneur Global, Digjitale, Entrepreneur.bg, Newtrend.bg, Startup Date, Traction Tribe, Times New Roman, Hotnews, Capital, Evenimentul Zilei, România Liberă, Academia Cațavencu, Yoda.ro, Incont.ro, Wall-Street.ro, Forbes România, Business24, Ziare.com, Business Review, Computer Games, Comunicații Mobile, Computer World, PC World, Agora, Business Cover, Business Woman, Zelist.ro, Comunicatedepresa.ro, Trade Ads Interactive, Gadget Trends, Games Arena, Gadget Talk, Softlead, Today Software Magazine, startups.ro and IQAds.

Irina Scarlat

irina.scarlat@howtoweb.co PR Manager @ How to Web & TechHub Bucharest



event

Alt-tester at Mobile World Congress 2015

Mobile World Congress is the largest event in the mobile technology industry: a huge exhibition, long-awaited product launches, outstanding conferences and seminars, intense networking. What is a testing services company doing at such an event and, more importantly, why is it carrying a little robot by the hand?

MWC15: The edge of innovation

At one of the stands in the Polish Pavilion, MWC15 attendees could win various prizes by answering some IT questions. "When was the first edition of CeBIT held?" was a question I answered wrongly, assuming that an event of this type could not have started prior to 1987. The event is, however, older than that: CeBIT has been held since 1970. I then returned to the MWC website to discover that it wasn't a junior either: MWC has been going on since 1987. And, looking around me, I could see it wasn't small-scale either: I could spot plenty of participants in Barcelona subway stations (many of us were flaunting our badges, despite security recommendations), in the socializing venues of Fira Gran Via, and pretty much all around and on the way to the other event venues. All this multitude of people amounted to a total of 92,000 participants from over 200 countries in the official event statistics.

What generates all this interest in the event?

Looking through my tester/exhibitor/smartphone-user glasses, I could definitely catch a glimpse of some of the reasons why IT people would want to attend MWC (let alone it being held in Gaudí's Barcelona, the land of tapas and sangria). First of all, Mobile World Congress is the dazzling launching pad for mobile technologies, services and equipment. Samsung held its first 2015 Unpacked event here, introducing the Galaxy S6 and Galaxy S6 Edge. HTC revealed the One M9, the Grip fitness band and a high-end virtual reality headset, Vive. LG launched the Urban smartwatches, one of them running on an operating system based on WebOS. And then there were Microsoft, Lenovo, Acer and Huawei, attracting just as many visitors, and particularly the media. Having the chance to be among the first to explore the newest devices, shortly after their launch, and talk about them with the manufacturers' representatives is an opportunity for gadget fans, IT trendsetters and mobile app developers. I can confirm that a lot of participants took advantage of this opportunity at the event, given the continuous Brownian motion in the giants' halls.

Other attractions for many MWC attendees are the conferences and seminars. This year there were more than 40 sessions held by C-level representatives of companies like Deutsche Telekom, Ericsson, Huawei, Mozilla, SAP or Wikipedia. The most anticipated session seemed to be the one held by Mark Zuckerberg, which continued the Internet.org saga (a story he started at MWC14) with the challenges Facebook faces in coordinating Internet providers in developing countries. As exciting as the keynote sounded in the description, it was just as disappointing for some of the participants.

After we take out the spectacular product launches and the long-awaited keynotes, what is left of MWC15 are only 1,900 exhibitors spread across 5 halls, each one like a horizontally distributed, reasonably sized shopping mall. It is not an exaggeration to say that the participant's guide should stress, for rookies, the importance of comfortable shoes.

What does a mobile exhibition look like?

Architecturally, each of the halls hosts lots of stands and pavilions. Stands that look like shelves in a dressing room, colorful stands, noisy or animated ones (yes, by animators; or by a variety of dynamic elements), stands built with budgets from European Union funding or from big, fat marketing budgets. Some of the exhibitors used the available space in a sober manner, limiting their product or service presence to slides and brochures. Others did it in a more glamorous way: interactive presentations, live application or device demonstrations, robots wandering across the stand, sports equipment to use while trying a wearable device; a Flamenco dancer, a barista right next to the sales representative lined up to present an offer or simply to chat with the attendees, improperly chosen palm trees, a car that was parked there to demonstrate how a driver can order food while at the wheel. All of these to convince the attendees to stop by, try the product or ask about the service, take some promotional materials, maybe a business card, eventually to initiate a business partnership.

Beyond wandering from one stand to another, there are other, more specific, opportunities to find new business partners. Such a need could not have remained unaddressed, so at MWC there are providers of B2B matchmaking services.


These services are the object of activity of specialized companies, but I found out that some countries had officials who took care of promoting the companies from their area among the other exhibitors. This way, many MWC attendees have a meeting agenda set up beforehand.

RomaniaIT. Creative Talent, Technical Excellence

There were 14 companies from Romania that had stands at Mobile World Congress this year. The Romanian Association for Electronics and Software Industry (ARIES-TM), with the help of the Ministry of Economy, facilitates the presence of these companies at exhibitions like MWC or CeBIT, while trying to improve the perception of the Romanian IT market, from outsourcing companies to innovative ones. The exhibitors benefit from financial and logistic aid during the events. There are two criteria to meet in order to be eligible for it: the first one sustains the innovation focus, so that companies with innovative products or services participate at the fair; the second one is more pragmatic and looks at having the taxes paid up to date. Having said that, at the Romanian Pavilion this year's attendees could see new models of the Allview smartphone and offers for IT solutions from the most prosaic domains: from the simulation of the driving exam (at the stand of Dapredi Soft Systems from Timișoara) or an auto fleet monitoring system (at our colleagues' stand from Arobs Transilvania), to educational software (from Sphinx IT Timișoara) or medical software (at the Ropardo stand).

Altap… it moves!

Altom went to MWC15 with Altap. Among our colleagues, it is also known by the internal code name Măgăoaia (a Romanian word describing a large person or object), just to remind us of its initial size; also known as Miska (Romanian for "it moves"), again, so as not to remove from the collective memory the iconic moment from the first demo when… it moved. Beyond the onomastic complexity, Altap is a robot integrated with a package of automated tests that run on smartphones or tablets. Or, as a visitor suggested to us, on smartwatches. It extends the capabilities we have from automated testing frameworks and performs actions we can't do programmatically.

The demonstration we gave at MWC15 was of a test written in Appium and run on an iPad. Within the test, we were verifying the error message we receive when we try to log into the Wordpress app while having the WiFi option off. While we can do most of the actions and asserts from Appium, the step of setting the WiFi off is not programmatically doable. Here is where Altap steps in: with a stylus at the end of its mobile arm, it taps on the device screen to perform the action that a tester would normally have to do manually. This way, test scripts that run on a less permissive operating system, like iOS, can be executed without interruptions for manual actions. Alongside the animated atmosphere it created at our stand, the robot worked as a live example of the solutions we find for the testing problems we are challenged with. "Hooked" by the mobile arm, the fair participants stopped by to eventually discuss our vision on testing and the way we approach it, the consultancy services we provide and the training sessions we organize for testers.
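To make the demo more concrete, here is a rough sketch of what the scriptable part of such a test could look like with the Appium Java client of that period. This is not Altom's actual test code: the capabilities, app path and UI locators are illustrative assumptions.

import io.appium.java_client.ios.IOSDriver;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.remote.DesiredCapabilities;
import java.net.URL;

public class WordpressOfflineLoginSketch {
  public static void main(String[] args) throws Exception {
    DesiredCapabilities caps = new DesiredCapabilities();
    caps.setCapability("platformName", "iOS");
    caps.setCapability("deviceName", "iPad");
    caps.setCapability("app", "/path/to/WordPress.app"); // illustrative path

    // Connect to a locally running Appium server
    IOSDriver driver = new IOSDriver(new URL("http://127.0.0.1:4723/wd/hub"), caps);
    try {
      // Turning WiFi off cannot be scripted on iOS; in the demo,
      // the robot taps the Settings toggle here instead of a human tester.

      // Fill in the login form and submit (locators are illustrative)
      driver.findElement(By.name("Username")).sendKeys("demo");
      driver.findElement(By.name("Password")).sendKeys("secret");
      driver.findElement(By.name("Sign In")).click();

      // Verify that the expected connectivity error message is shown
      WebElement error = driver.findElement(By.name("No internet connection"));
      System.out.println("Error message displayed: " + error.isDisplayed());
    } finally {
      driver.quit();
    }
  }
}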

Altap is the sum of the work our colleague Bogdan Birnicu did for his bachelor thesis, the solution that Ru, another colleague, found for the image recognition challenge, and the collective effort the Altom people made for the test design, as well as the assembly and presentation of the robot. And the naming, let's not forget that. On the hardware part, especially the 3D printing of the components, we benefited from the help of Alexandru Popescu, PhD candidate at the Technical University of Cluj-Napoca.

Without necessarily being a promoter of mobile devices and technologies myself, just a smartphone user and a mobile app tester, I enjoyed participating in MWC15 to the extent that I can consider it my greatest professional experience yet. On the one hand, visiting the other stands, I had the opportunity to talk to professionals from related but variously different industry fields about products, services and the challenges we encounter; about trends on the mobile market, what changes they bring and how we can embrace them. I watched the structure of their speech, the way they answer challenging questions, how they promote their company and how they sell their products or services. Also, I tried newly launched devices and equipment I may test on in the future. On the other hand, I had the possibility to talk to developers, IT project managers and business people about the challenges they encounter. This was an excellent exercise in addressing testing problems and in better understanding some of the needs the beneficiaries of our work have. Overall, it resulted in a fresh experience of training on customer awareness; an experience I enjoy talking about. So, do ask me about it.

Simina Rendler

simina.rendler@altom.ro Software Tester @ Altom Consulting



communities

IT Communities

As I was mentioning in the editorial, we had two important events in March: …even mammoths can be Agile and Cluj Innovation Days. In April, Startup Weekend is scheduled to take place, an important event dedicated to the creation of startups. We also applaud the initiative of the Big Data meetup group for the organization of the Big Data Romanian Tour: Cluj – Timisoara – Bucuresti.

Transylvania Java User Group
Community dedicated to Java technology
Website: www.transylvania-jug.org
Since: 15.05.2008 / Members: 598 / Events: 47

TSM Community
Community built around Today Software Magazine
Websites: www.facebook.com/todaysoftmag www.meetup.com/todaysoftmag www.youtube.com/todaysoftmag
Since: 06.02.2012 / Members: 2215 / Events: 30

Cluj Business Analysts
Community dedicated to business analysts
Website: www.meetup.com/Business-Analysts-Cluj
Since: 10.07.2013 / Members: 91 / Events: 8

Cluj Mobile Developers
Community dedicated to mobile developers
Website: www.meetup.com/Cluj-Mobile-Developers
Since: 05.08.2011 / Members: 264 / Events: 17

The Cluj Napoca Agile Software Meetup Group
Community dedicated to the Agile methodology
Website: www.agileworks.ro
Since: 04.10.2010 / Members: 437 / Events: 93

Cluj Semantic WEB Meetup
Community dedicated to semantic technology
Website: www.meetup.com/Cluj-Semantic-WEB
Since: 08.05.2010 / Members: 192 / Events: 29

Romanian Association for Better Software
Community dedicated to experienced developers
Website: www.rabs.ro
Since: 10.02.2011 / Members: 251 / Events: 14

Tabăra de testare
Testers community from the IT industry, with monthly meetings
Website: www.tabaradetestare.ro
Since: 15.01.2012 / Members: 1243 / Events: 107

Calendar

March 24 (Cluj) Launch of issue 32 of Today Software Magazine
www.todaysoftmag.ro

March 24 (Timișoara) Timisoara WordPress March Meetup
meetup.com/Timisoara-WordPress-Meetup/events/220966568/

March 25 (Iași) Enki.js (what I learned building a web framework)
meetup.com/Iasi-JS/events/221279113/

March 26 (Timișoara) TdT#29 - the Testing Map by Claudiu Draghia
meetup.com/Tabara-de-Testare-Timisoara/events/220453273/

March 26 (Brașov) Flying Penguins: Embedded Linux Applications for Autonomous UAVs
meetup.com/bv-tech/events/219375757/

March 27 (Cluj) BigData Romanian Tour: Cluj-Timisoara-Bucuresti
meetup.com/Big-Data-Data-Science-Meetup-Cluj-Napoca/events/220876181/

March 31 (București) Mobile Advertising Congress
conference-arena.com/mobile-advertising-congress

April 1 (București) April BucharestJS Meetup
meetup.com/BucharestJS/events/221195509/

April 14 (Cluj) UI/UX Cluj Meetup (open call4speakers)
meetup.com/UXUICluj/events/220935531/

April 24 (Cluj) Cluj Startup Weekend - recommended by TSM
cluj.startupweekend.org/



programming

Augmented reality on mobile devices

Alexandru Fediuc

alexandru.fediuc@msg-systems.com Associate IT Consultant @ msg systems România

Virgil Andreies

virgil.andreies@msg-systems.com Associate IT Consultant @ msg systems România

Augmented reality has entered the area of interest of consumers and, of course, that of programmers, too, along with the development of processors and graphics cards on mobile devices. However, one of the first devices that used the idea behind this technology was the Sensorama, created by Morton Heilig over 40 years ago. The device worked on similar principles, but with a more "rudimentary" manner of implementation. What made augmented reality famous was the appearance of the well-known Google Glass, and the one that managed to push the barriers further is the device patented by Microsoft, Kinect, together with the virtual headphones. I will not focus on these topics, as they belong to a different category, which I would call "still experimental". Nevertheless, these technological "pushes" made the appearance of Augmented Reality (AR) possible on mobile devices, too. Nowadays, even a novice programmer can build such an application with the help of some powerful SDKs made available to anyone.

AR is a way of augmenting physical elements by superposing digital content onto them. On mobile devices, applications make use of the device's different sensors, such as the GPS, the video camera or the microphone. The industry most "affected" by this trend is gaming, followed closely by retail, but more and more domains are finding some use for augmented reality. Whether they are e-learning applications, which can identify texts, logos or other graphical artifices, or applications which can give you information when you simply place the camera in front of a historical monument, they prove that this technology is already beginning to take shape. AR creates a connection between the user, the environment and the virtual world. The AR technique is that of attaching or assigning 3D or 2D pictures to real elements by means of so-called "markers". An example of a visual marker is a 2D bar code. In addition, numerous sensors, such as movement and tracking sensors, recognition or image and gesture analysis sensors and, most of the time, the GPS, are used in AR.

Tracking methods

In order for the application to know where you are exactly and what it is that you are looking at (the location and orientation of the camera), a calibrated video camera is needed. The system through which the location and its relative orientation are calculated is called tracking. This is one of the foundations of augmented reality. However, in order to transpose a virtual object correctly into reality, something additional is required, and this is a marker. Its role is to define the size of the virtual object as well as to help recognize the orientation of the video camera. A good marker is one that is easy to detect in any circumstances, such as those based on brightness differences rather than those based on color variations, which can become difficult to interpret because of light variation. Many marker systems use black-and-white squares in order to make a clear distinction between markers and non-markers. Markers can be of several different types:



• template markers – where the match is made with the help of a black-and-white template. It is advisable to use a clearly defined image, framed by a border.
• bar codes – formed, in most cases, of black-and-white cells framed by a border or accompanied by some graphical marks.
• imperceptible markers – images, infrared markers, miniatures (markers impossible to detect by the human eye).

Another way of tracking is the one based on a model. This system consists of comparing a digital model to a real object within a scene. The concept is based on the sequential analysis of a visual scene and the provision of conceptual descriptions of the events occurring inside it. In order to better understand this system, I suggest the following scenario: a street where cars pass every day and a video camera above it. First of all, it is necessary to separate the static elements from the dynamic ones, that is, to segment the movement. Next comes the creation of geometrical 3D models to superpose on as many car categories as possible and the creation of a movement pattern in contrast to the static road. Thus, we can create a scene where the cars are taken out of context and become the object of focus.

Frameworks

There are already several libraries on the market that help programmers by giving them the possibility to invest their time in conceiving the product and the software idea rather than in the algorithms necessary for the creation of markers and the usage of the different sensors of a mobile device. Most of these frameworks are cross-platform, meaning that they can be used on several devices and systems. Of all these, three SDKs have drawn my attention and are worth mentioning.

Vuforia

Qualcomm's platform offers a wide range of support for different systems, thus providing the possibility of writing a native application and making it available on a wide range of devices. It uses a technology based on Computer Vision for perceiving and tracking planar images (Image Targets) and simple 3D objects, such as cuboid objects or spheres, in real time. Among the advantages, we should mention the fact that it is a free library which offers support for iOS, Android and Unity 3D. 3D objects can also be created by means of code; it supports multi-tag, extended tracking (when the marker is no longer present in the shot) and face-tracking and, last but not least, it functions very well with the NinivehGL graphics engine. Moreover, the tracking is much more stable compared to the other platforms. Among the disadvantages are the facts that it does not have a graphical interface, that the development of an application is more difficult until you get accustomed to the platform, and that you will have to write separate code for each system (this, however, can be solved once you integrate it with Unity 3D).

D'Fusion

Total Immersion's package has a wide range of support for most devices. It has a rather good graphical interface where you have the possibility to create the entire scenario. The programming part is carried out in LUA, and the Android and iPhone libraries are already precompiled, the applications built in D'Fusion being independent of the operating system. It offers support for Unity 3D and it is compatible with files from Maya or Blender. The D'Fusion Studio development platform can be downloaded for free. D'Fusion is mostly oriented towards retail, providing many tools in this line.

Metaio

Another fashionable and very easy to use platform is Metaio. Just like the other SDKs mentioned above, this one provides support for most of the known tracking methods, too: markers, 3D models, image targets etc. Important companies have turned to this platform for the development of successful applications: Ikea, Lego, Audi. But Metaio does not offer tools of the "Code Once" type, so it is necessary to program separately for iOS and Android. Metaio shows a lot of potential, but the fact that you have to pay in order to use the framework and the existence of rather poorly written documentation keep many potential programmers at a distance.

The Creation of an Augmented Reality application by using the Unity 3D engine and the Vuforia extension for Unity

Unity 3D

What is Unity 3D? Unity 3D is an extremely powerful 3D engine as well as an extremely user-friendly development environment for interactive applications. It has the advantage of being very easy to use by people who do not possess solid knowledge of programming, as well as by professionals. Another benefit is the fact that Unity Technologies offers two versions to developers: the free one and the Pro version, for which the user has to pay. The Pro version offers more features, for the amount of $1,500. However, this price is completely justified if we consider the fact that the Unity publishing license is very permissive. For a starter, the free version should be enough. A short comparison of the two versions can be found at http://unity3d.com/unity/licenses, which is also the place from where you can download the free version.

General features

The engine uses three programming languages: C#, Boo and Unity JavaScript, and it can be used to develop applications for most operating systems, even the mobile ones. In addition, it offers the opportunity to work directly in the 3D environment, suitable for creating game levels, menus and animations, and for developing scripts and attaching them to objects. All of these are available within only a few clicks, the graphical interface being extremely easy to learn. A Unity project is a simple folder which contains every resource that belongs to the game or to the interactive application.

Assets

The assets represent every resource that the application uses. Therefore, under the name of "Assets" we include the 3D models, materials, textures, audio resources, scripts and fonts. Apart from a few simple objects, considered primitive, such as cubes and spheres, Unity does not have the possibility of creating these assets.


Instead, they have to be created externally, using 3D modeling applications and graphical painting tools, and afterwards imported into Unity. This is very easy to achieve, the import being at the same time robust and smart. Unity accepts all popular file formats, including 3D Studio Max, Blender, Maya and FilmBox, keeping the materials, textures and rigging.

The Scenes

The scenes are the locations where the objects from the assets are placed and arranged in order to create play screens. The hierarchy board shows the content of the current scene in a tree format.

Scripting

The scripts are known as behaviours. They ensure the manipulation of resources and the creation of interactivity between them. They can be reused for several objects, their attachment to a resource being done in an extremely simple manner. At the same time, several scripts can be added to the same game object.

Example (C#):

using UnityEngine;
using System.Collections;

public class PlayerScript : MonoBehaviour {

	// Use this for initialization
	void Start () {
	}

	// Update is called once per frame
	void Update () {
	}
}

Note: The name of the class should be the same as the name of the file where it has been created. All the scripts that are attached to an object contain the Start() and Update() methods. The Start() method is only called once, when the object is created, whereas the Update() method is called once per frame.

// Read the input axes and move the object accordingly, once per frame
void Update () {
	float horizontal = Input.GetAxis("Horizontal");
	float vertical = Input.GetAxis("Vertical");
	transform.Translate(horizontal, vertical, 0);
}

Now that we have created the script, it should be assigned to the asset. This can be done by "drag-and-drop" onto the game object. With the assigned script, the game can be run.

Publication

Unity can publish on Windows, OS X and, through the Web Player plug-in, on the web. Web Player is a plug-in for browsers, which works with all known browsers and offers the same performance as the stand-alone desktop application. With Unity Pro, you can publish for a wider range of platforms, including: Android, iOS, Wii, Xbox One, Xbox 360, PS3, PS4, Windows Store, Windows Phone, Flash.

Vuforia

What should we know about Vuforia? Vuforia has several technologies incorporated in its SDK, which come to the help of developers. Among them, there is Computer Vision, a technology through which developers can position and orient virtual objects, such as 3D objects, in correlation with images of the real world when they are viewed through the camera of a mobile device. The virtual object follows the position and orientation of the image in real time, so that the user's perspective on the object corresponds to the perspective of the target image. Therefore, the virtual object will appear as part of the real world scene. Vuforia allows some variations in the implementation of augmented reality: the model over which this virtual world/virtual object overlaps is an image, a unique target called an Image Target, which can even be a marker offered by Qualcomm. Vuforia also offers the possibility of multiple targets. The SDK supports a variety of target types, including "markerless" targets, multi-target 3D configurations, virtual buttons using "Occlusion Detection" and the possibility to create and reconfigure classes of targets at runtime. Vuforia offers APIs in C++, Java, Objective-C and in .NET languages through the extension to the Unity engine. This way, the SDK provides support both for development in the Android and iOS native environments and for the development of AR applications in Unity. These can be equally easy to port to several platforms, including Android and iOS.



In the example described below, we will use a free Vuforia marker. The 3D object has been overlapped on the image. This object is built with a set of Blender and Photoshop tools. Through the sophisticated algorithms of Computer Vision, the features of the image are detected and tracked. The target is eventually recognized through successive comparisons of these features and characteristics with those of the image, kept in a database. From the moment the target is detected, it will be tracked for as long as it remains in the view of the photo/video camera. The creation of targets requires access to a user account on the Vuforia site. The targets are created from .jpg or .png (RGB and greyscale) files. The characteristics are kept in a database, organized into data sets.

Creation and running of an example – tutorial

We are going to briefly describe, in the following lines, all the steps (some of them can be, of course, replaced by alternative approaches) in the development of an AR application. Suppose the user has installed the compatible versions of Unity and the Vuforia extension for Unity. In addition, he needs a web camera or the camera of his smartphone or tablet. Also, print the target image on an A4 sheet of paper, after having created it.

1. After installing the tools, create an account on the official Vuforia site: https://developer.vuforia.com/user.
2. Create the target (Image Target). Navigate to the Target Manager web application on the developer's portal. This application allows the creation of a target database, so that the targets can be used on certain devices as well as in the cloud. Create a database, give it a name and assign a target to it. After the uploading of the target is complete, Vuforia runs the necessary checks and processing. Then you can download the image target: download the file with the extension .unitypackage which contains the target.
3. Start Unity, create a new project and import the .unitypackage files of Vuforia (the SDK and the image target). Delete the Main Camera from the hierarchy of the scene.
4. Import the 3D model that you wish to place over the image target.
5. In the Project window, open the Assets/Qualcomm Augmented Reality/Prefabs folder. Place the ARCamera object in the scene. With this object selected, search in the Inspector and make sure the "Load the Data Set" option with your database (Image Target) is set as "Active".
6. From the same Prefabs folder, import the image target into the scene. With the image selected, search using the Inspector and set the "Data Set" to the image target. The previously created image should be visible in the Unity editor.
7. Using "drag-and-drop", add the model to the image target object in the Unity hierarchy. Use the facilities, values and moving tools on the x, y, z axes in order to fix the 3D object right in the center of the target.

From now on, everything depends on your creativity. A suggestion we can make is to place a source of light (directional light) from Unity to shed light upon the model. The example can be run with the "Play" button. Vuforia and Unity will detect the web camera, and Vuforia will apply the detection and tracking algorithms and place the object on the printed image. The application can then be ported, with the help of the inner tools of Unity, to run on a mobile device.

Augmented reality – why only now?

I have tried in these lines to draw an overall picture of this emergent technology. It is not as if augmented reality and its implications in everyday life have just been discovered. But overcoming the problems emerging in the development of this technology (and, implicitly, extending its involvement in more areas) needs time. These problems originate in several domains: the sociological domain – the mentality through which we still see mobile devices as a sort of PC, when they can be much more; the technological one – AR applications require powerful graphical processors in order to be able to overlap the 3D object(s) in real time, without distortions or interruptions, which also means a much greater energy consumption; and user interaction – the creation of easy to use applications that can be applied in real life. An AR application has to run in real time; otherwise it will use outdated, false information. The performance of AR applications on mobile devices is completely dependent on optimization algorithms, since their processing power and memory are limited. AR applications are necessary in situations where human perception can be improved and where the usage of virtual objects in our everyday life can significantly improve our living. These applications can bring us a new way to see and interact with the real environment and the virtual one at the same time: an improved reality in our own pocket, too.


programming

Hadoop MapReduce deep diving and tuning

Tudor Lăpușan

tudor.lapusan@telenav.com Java & Big Data developer @ Telenav

MapReduce is the main batch processing framework of the Apache Hadoop project. It was developed by Google, which published an article describing the MapReduce concept in 2004.

In 2006, Doug Cutting succeeded in implementing this concept and put it into an Apache project, namely Apache Hadoop [1]. The first release happened on 14 September 2007. This was the beginning of Big Data for everyone, from simply curious people to companies of any kind. Soon, Apache Hadoop gained a very strong community, including big players such as Yahoo, Facebook, Ebay, IBM, Linkedin and others [2]. For easier adoption, other frameworks were developed on top of MapReduce, which are much easier to learn and work with. One example is Apache Hive [3], which was developed at Facebook. Because almost anyone in computer science has SQL knowledge, Facebook developed Hive, which allowed them to query and analyze their datasets by simply using the HiveQL language, very similar to SQL. This way, anyone on the Facebook team with SQL knowledge had the ability to use the power of MapReduce.

MapReduce general view.

MapReduce is a distributed framework which runs on commodity hardware and is used for data processing. It has two main phases, Map and Reduce, and another phase, Shuffle, which is not so well known, but which in some use cases can slow down or speed up your entire execution.

[1] http://hadoop.apache.org/
[2] http://wiki.apache.org/hadoop/poweredby
[3] https://hive.apache.org/

For the majority of data processing use cases with the MapReduce framework, the Map phase goes through the entire dataset and applies various filters, while the Reduce phase is the place where we actually apply our algorithms. To better understand how MapReduce works, I recommend reading about MapReduce’s HelloWorld, the Wordcount example [4]. It simply computes the frequency of each word in a dataset. The beauty of MapReduce is that the same code which works for a dataset of a few MBs can work on much bigger ones, TBs, PBs or even more, without any modification of our program. This is due to the nature of MapReduce’s distributed execution, which automatically takes care of work distribution and task failure. Below, you can see the pseudo-code representation of the Wordcount example.

mapper (filename, file-contents):
    for each word in file-contents:
        emit (word, 1)

reducer (word, values):
    sum = 0
    for each value in values:
        sum = sum + value
    emit (word, sum)
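As a companion to the pseudo-code, here is a minimal Java sketch of the same mapper and reducer, written against Hadoop’s MapReduce API (class and variable names are illustrative, not taken from the original article):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCount {
    public static class TokenizerMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Emit (word, 1) for every word in the current line.
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }

    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values,
                Context context) throws IOException, InterruptedException {
            // Sum the partial counts produced for this word.
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }
}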

In the next picture, you can see the general MapReduce process for a Wordcount execution. Each map phase receives its input and prepares intermediary pairs of (key, value), where the key is the actual word and the value is the word’s current frequency, namely 1.

4 http://wiki.apache.org/hadoop/WordCount




The shuffle phase guarantees that all pairs with the same key will serve as input for a single reducer, so in the reduce phase we can very easily calculate the frequency of each word.

MapReduce deep dive

First of all, the configuration properties and steps involved in MapReduce tuning below refer to MapReduce V1. There is a newer MapReduce version, V2, for which a few details can differ. More than basic MapReduce knowledge is assumed in the next sections. As already mentioned, a complete MapReduce execution has two main phases, map and reduce, with another phase, shuffle, between them.

Map side

Each Map phase receives as input a block (input split) of a file stored in HDFS. The default size of a file block is 64 MB. If the entire file is smaller than 64 MB, the Map phase will receive the entire file as input.

When the Map phase starts to produce output, it is not written directly to the disk. The process is more involved and takes advantage of RAM by allocating a buffer where the intermediary results are stored. By default, the size of this buffer is 100 MB, but it can be tuned by changing the io.sort.mb property. When more than 80% of the buffer is filled, a background process spills the content to disk. The 80% threshold can be changed as well, using the io.sort.spill.percent property. Before the data is spilled to disk, it is partitioned based on the number of reduce processes. For each partition, an in-memory sort by key is executed and, if a combiner function is available, it is run on the output of the sorting process. Having a combiner function helps us compact the map output, so we have less data to write to disk and to transfer through the network. Each time the buffer memory threshold is reached, a new spill file is created, so in the majority of map executions we end up with multiple spill files.

After the map phase is finished, all the spill files are merged into a single partitioned and sorted output file. It is also recommended to compress the map output as it is written to disk, to speed up the disk writing, to save disk space and to reduce the amount of data transferred to the reducers. Compression is disabled by default, but it can be enabled very easily by setting the mapred.compress.map.output property to true. Supported compression algorithms are DEFLATE, gzip, bzip2, LZO, LZ4 and Snappy.

The Reduce phase fetches its input over the HTTP protocol. Let’s see what happens on the reduce side.

Reduce side

After a map execution finishes, it informs the job tracker, which knows to which reducers to send each partition. Furthermore, a reducer needs the map output from several map tasks, so it starts copying their outputs as soon as they are finished. The map outputs are copied directly to the reduce task JVM’s memory if they are small enough; if not, they are copied to disk. When the in-memory buffer reaches a threshold size (controlled by mapred.job.shuffle.merge.percent) or a threshold number of map outputs (mapred.inmem.merge.threshold), it is merged and spilled to disk. If a combiner is specified, it will be run during the merge, to reduce the amount of data written to disk. If we end up with multiple spill files on disk, they are also merged into larger, sorted files to save some time later on.

When all the map tasks are finished and their outputs are copied to the reduce tasks, we enter the reduce merge phase, which merges all map outputs, maintaining their sort order by key. The result of this merge serves as input for the reduce phase. During the reduce phase, the reduce function is invoked for each key in the sorted output. The output of this phase is written directly to the output file system, typically HDFS.

The shuffle phase covers all the processing from the point where the map produces output to where the reduce consumes it. In other words, the shuffle phase involves sorting, merging and copying data between the map and the reduce phases.

MapReduce configuration tuning

Now that we have seen the internal steps of MapReduce and understand them better, we can start to improve the overall MapReduce execution. I’m going to give you some general advice on how to tune your MapReduce execution.



Generally, it is better to give the shuffle phase as much memory as possible, so the data is processed in RAM instead of on disk. Because the shuffle phase uses RAM from the memory assigned to the map and reduce phases, we should be careful to leave enough memory for the map and reduce execution. This is why it is best to write the map and reduce functions to use as little memory as possible (by avoiding the accumulation of values in a map, for example). The amount of memory given to each map and reduce execution is controlled by the mapred.child.java.opts property. We should give them as much memory as possible, but without exceeding the amount of RAM on the server. On the map side, the best performance is obtained by avoiding multiple spills to disk; one is optimal. For this, we should estimate the size of the map output and change the corresponding properties (e.g., io.sort.mb) to minimize the number of spill files on disk. On the reduce side, the best performance is obtained when the intermediate data can reside entirely in memory. By default this does not happen, since in the general case all the memory is reserved for the reduce function. But if your reduce function has light memory requirements, setting the right properties may boost your performance. For this, take a look at the mapred.inmem.merge.threshold and mapred.job.reduce.input.buffer.percent properties.
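As an illustration, here is how some of these knobs could be set from Java using the MapReduce V1 JobConf API; the values are illustrative assumptions, not universal recommendations:

import org.apache.hadoop.mapred.JobConf;

public class TuningExample {
    // Returns a JobConf with the tuning properties discussed above applied.
    public static JobConf tunedConf() {
        JobConf conf = new JobConf();
        conf.set("io.sort.mb", "256");                       // larger map-side sort buffer
        conf.set("io.sort.spill.percent", "0.90");           // spill later
        conf.setBoolean("mapred.compress.map.output", true); // compress map output
        conf.set("mapred.child.java.opts", "-Xmx1024m");     // heap per task JVM
        conf.set("mapred.job.reduce.input.buffer.percent", "0.5"); // keep reduce input in RAM
        return conf;
    }
}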

Conclusion

If you just want to give the MapReduce framework a try, I don’t recommend bothering with the above tuning, because the default configuration works well enough. But if you are really working with big datasets and want to wait only 3 days instead of 5 for your analysis results, I strongly recommend taking tuning into consideration. Here in Cluj-Napoca, we have a strong Big Data community [5] with relevant topics and workshops about Big Data. Just join us if you want to discover more! The next meetup [6] will be the biggest one so far; we have awesome speakers from Bucharest and Timisoara, who will talk about Elasticsearch, Spark, Tachyon and Machine Learning.

5 http://www.meetup.com/big-data-data-science-meetup-cluj-napoca/
6 http://www.meetup.com/big-data-data-science-meetup-cluj-napoca/events/220876181/



programming

High-performance Messaging Systems - Apache Kafka


Tiberiu Nagy

tiberiu.nagy@betfair.com

Senior developer @ Betfair

With the transition to event-based architectures, messaging systems have become the central component of enterprise architecture. But as enterprises process more and more data, the performance of the messaging system becomes even more important, requiring fast, scalable solutions. Apache Kafka is a relatively new player in the messaging systems field, but it has already proven itself one of the best-performing messaging solutions out there: it has been benchmarked at up to 1 million messages per second on a 3-node cluster made of commodity hardware. Kafka was created at LinkedIn, during a period when LinkedIn was transitioning from a large monolithic database to a set of specialised distributed systems, each with its own data store. One of the challenges they faced was shipping access logs from their front-end web servers to their analytics service. They needed a system which could deliver very large data volumes to multiple destinations. None of the existing messaging solutions met their performance requirements, so they started designing their own solution under the name Kafka. The project was later open-sourced and donated to the Apache Software Foundation. It has since been successfully used at various companies that needed high-throughput messaging.

Speed vs Features

The main design goal behind Kafka was to make it as fast as possible. In order to achieve high throughput, Kafka takes a novel approach to messaging, sacrificing some traditional messaging features in the interest of speed. One of the most important simplifications is in the way messages are retained: producers publish messages to the cluster, which then become available for consumers to process, but consumers do not need


to acknowledge the processed messages. Instead, Kafka retains every message for a fixed amount of time. Consumers are free to process any message that is still available in the cluster. While this might seem suboptimal, it brings a number of advantages:
• It tremendously simplifies the broker’s design: it does not have to keep track of which messages have been consumed.
• It completely decouples producers from consumers. In messaging systems that require acknowledgment, performance might degrade if messages remain unacknowledged. Some systems start throttling the producers in order to protect the consumers and the messaging system’s performance. This can lead to a dangerous situation in which a slow but unimportant consumer can severely degrade the performance of a mission-critical producer.
• Consumers can be stopped at any time, without impacting the Kafka cluster’s performance. Consumers can even be batch jobs that are only periodically executed. As long as the data is retained long enough and the jobs are fast enough to process the data accumulated between executions, the system remains functional.


Because messages do not need to be selectively retained, Kafka can use a very simple and efficient storage model: the commit log. The commit log is an immutable sequence of messages that is continually appended to, but never modified. Because writes always happen at the end of a log, this structure is well suited to conventional hard disks: the disk is written to sequentially as new messages arrive, avoiding costly seeks. If the consumers can keep up with the producers, Kafka can even serve messages out of the operating system’s page cache, bypassing the disk completely. Another important architectural simplification is in terms of messaging patterns. Traditionally, messaging systems offered two messaging patterns: queue and publish-subscribe. In the queue pattern, a set of consumers may read from a server, but only one consumer may receive a particular message. In the publish-subscribe pattern, each message is dispatched to every consumer. Kafka provides a single abstraction over the two modes: the consumer group. Each consumer must be part of a group (even if it is the only member of the group). Within a group, only one consumer can receive a message; however, a message is dispatched to all consumer groups. This simplification allows Kafka to use a single grouping abstraction for messages: the topic. The Kafka messaging model can then be perceived as a publish-subscribe model in which consumer groups, and not individual consumers, subscribe to a topic. If all of the consumers belong to the same group, the topic acts like a queue in the traditional sense: only

one consumer will receive the message. If, on the other hand, each consumer has its own group, the topic acts as a traditional publish-subscribe mechanism: every consumer receives every message. In practice, the number of consumer groups will be small, each group usually corresponding to a service wishing to consume the messages from Kafka. Within each group, there will be multiple consumers, usually one for each host running the service. Since all consumer groups receive every message, each service will receive the full message stream; however, message processing will be load-balanced between the hosts running the service.

Message consumption with 3 producers and 2 consumer groups; m1-m5 are messages sent by the producers
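To make the model concrete, here is a minimal sketch of publishing to a topic with the Java producer client (API shape as of Kafka 0.8.2; broker addresses and the topic name are illustrative):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AccessLogPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092,broker2:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        // The key (here, a host name) influences which partition the
        // message lands on; the value is the payload.
        producer.send(new ProducerRecord<>("access-logs",
                "web-host-17", "GET /index.html 200"));
        producer.close();
    }
}

A consumer would then subscribe to the access-logs topic by configuring a group.id; all hosts of one service share that id, so the topic’s partitions, and thus its messages, are divided among them.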

Architectural overview

From a bird’s eye view, a Kafka installation consists of a set of Kafka broker nodes and a Zookeeper cluster. Kafka uses Zookeeper for coordination and cluster management, while the brokers receive, store and serve messages. The data stored for a topic might exceed the capacity of a single broker, so Kafka further subdivides topics into partitions. Each partition is a commit log and must fully reside on one broker. However, the partitions belonging to a topic are evenly distributed among the brokers, so that

each broker stores an approximately equal number of partitions.

Distribution of 4 partitions across 2 brokers

When a producer wishes to publish a message to a topic, it requests the topology of the cluster from a broker, then determines which partition to publish to (based on the contents of the message, randomly, or in a round-robin fashion), and sends the message to the broker on which the partition resides. Things are, however, a bit more complicated on the consumer side. The partitions of a topic are equally distributed between the consumers belonging to a group. Each consumer will process messages from the partitions it is assigned to. This guarantees that each message will be received by a single consumer within the group. The offset of the last consumed message for each partition is retained in Zookeeper, so that if a consumer goes away, its partitions can be reassigned to other consumers, which then start processing from where the leaving consumer left off. Partitioning nicely distributes the load across brokers and thus increases throughput. But what if one of the brokers fails or becomes inaccessible? The partitions



on that broker would then be unavailable or, depending on the type of failure, lost forever. To protect against this, Kafka introduces the concept of replica partitions. Each partition (from now on called the leader) will have a number of replica partitions. The replica partitions are always stored on a broker different from the one hosting the leader partition. A producer can never publish to a replica partition. Instead, the broker holding the leader partition automatically publishes any messages it receives to the leader’s replica partitions. This way, the replica partitions always contain the same data as the leader partition. If the broker holding the leader partition fails, a replica partition is automatically promoted to leader, so the cluster continues to operate normally. When the failed broker is brought back up, it re-syncs its partitions from the current leader, after which an election is held and partition leadership is re-assigned among the brokers.

Like any technology, Kafka has its set of limitations. The impossibility of re-consuming individual messages selectively can be a major drawback for certain types of applications. Another limitation is in terms of tooling and support: Kafka has only been around for a few years, so it does not have as rich an ecosystem as other messaging solutions such as ActiveMQ or RabbitMQ. Kafka’s reliance on Zookeeper can also be a financial or administrative disadvantage, as it increases the number of machines that must be provisioned and maintained. Hopefully, some of these limitations will go away as the technology matures and gains wider adoption.

Conclusions

Kafka can be a good solution for applications that require high-throughput, low-latency messaging. Its speed, simple design and flexible messaging semantics make it an ideal fit for use cases such as log and metrics aggregation, stream processing and event sourcing.



programming

Spatial Data in the SQL Server


Spatial data is used to represent information about the location and shape of geometric objects. These objects can be simple point locations or more complex structures: roads, rivers, cities or countries.

Diana Muntea

diana.muntea@yardi.com Software Developer @ Yardi România

Beginning with the 2008 version, Microsoft’s SQL Server products offer support for geospatial data. This allows storing spatial data types within tables as points, lines and polygons. SQL Server also offers a large variety of functions for managing this data, as well as spatial indexes to run queries more efficiently. In SQL Server, spatial data can be of two types:
• Geometrical (geometry) – data represented in a Euclidean system (flat-earth, 2D)
• Geographical (geography) – data that takes into account the curvature of the Earth and is represented using an ellipsoidal system (round-earth, 3D, 4D).
Both types are implemented using the .NET common language runtime (CLR). The two types usually behave in a similar manner, but there are some differences as well:
• The way two points are connected – the connection is represented as a straight line for geometry types, but as a circular arc for geography types.

• Measurements in spatial data types – in the planar system, distance is measured in the same units used for representing the coordinates of the points; the distance, as a number of units, will always be the same regardless of the measurement system used. In the ellipsoidal system, which takes into account the curvature of the planet, the coordinates of the points are represented using latitude and longitude, whereas distances and surfaces are usually expressed in meters or miles. The measurement unit also depends on the SRID identifier.
• Spatial data orientation – orientation is not relevant for geometry types. However, geography instances have no value if we don’t specify orientation, because we would never know whether they belong to the northern or the southern hemisphere. In SQL Server 2014, all geography instances must be located in a single hemisphere. Moreover, the result of an operation between two geography objects (intersection, union, difference) must also belong to only one hemisphere.



SRID – Spatial Reference Identifier – corresponds to a spatial reference system based on the specific ellipsoid used for either flat-earth or round-earth mapping. The identifier is defined by the European Petroleum Survey Group (EPSG) standard. A column may contain objects with different SRIDs, but we cannot perform operations between objects with different SRIDs (they are not based on the same unit of measurement, datum and projection). The most common measurement unit is the meter or the square meter. For geometry data the default SRID is zero; for geography it is 4326 (also used by the Google Maps API).

Available objects for the geometrical and geographical data types (picture: MSDN)

Examples

Geometrical data type:

CREATE TABLE myTable (
  id int IDENTITY (1,1),
  geometryData geometry );
GO

INSERT INTO myTable (geometryData)
VALUES (geometry::STGeomFromText('LINESTRING (100 100, 20 180, 180 180)', 0));
INSERT INTO myTable (geometryData)
VALUES (geometry::STGeomFromText('POLYGON ((0 0, 150 0, 150 150, 0 150, 0 0))', 0));
GO

-- declarations added so the selection runs as a standalone batch
DECLARE @geom1 geometry, @geom2 geometry, @result geometry;
SELECT @geom1 = geometryData FROM myTable WHERE id = 1;
SELECT @geom2 = geometryData FROM myTable WHERE id = 2;
SELECT @result = @geom1.STIntersection(@geom2);

Geographical data type:

CREATE TABLE myTable (
  id int IDENTITY (1,1),
  geographyData geography );
GO

INSERT INTO myTable (geographyData)
VALUES (geography::STPolyFromText('POLYGON((-73.9998722076416 40.726185523600634,
  -74.00708198547363 40.73860807461818, -73.99824142456055 40.7466717351717,
  -73.97326469421387 40.74628158055554, -73.97309303283691 40.7269010214160,
  -73.9998722076416 40.726185523600634))', 4326));


SQL Server has several functions and methods that allow us to manage spatial data types: for importing data objects (STGeomFromText, STGeomFromWKB), for performing different types of operations (STContains, STOverlaps, STUnion, STIntersection) or for making different measurements (STArea, STDistance), including nearest-neighbor searches based on STDistance. Starting with SQL Server 2012, the FullGlobe data type is defined: it represents a polygon that covers the entire globe. This polygon has an area, but it has no boundaries.



What type of data should I choose for my application: geometry or geography?

The type of data you choose depends on the application and its purpose. From the point of view of data storage, there is no difference between the two types of spatial data. But if we compare performance, geometry queries are much faster. In the end, the most important argument is functionality. If we have an application for measuring the distance between different locations, or other operations where we need to take into account the shape of the Earth, we will need to use geographical data. In other cases, for example if we only need to visualize different polygons, geometrical data might be enough.


Applications

Radius search

Let’s assume that we have a collection of points, determined by latitude and longitude, representing different locations. This type of search implies drawing a circle, determined by a center and a radius measured in a certain unit (meters). In this case we can only use geographical data, given that the search criterion is the distance between two points. The optimal option is saving the points in three columns: latitude, longitude and a geographical point:

geoPoint = geography::STGeomFromText('POINT (-96.8501 32.7639)', 4326)

This way, before applying the spatial filter, we can pre-filter the data using the bounding box of the circle.
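A sketch of how this could look from Java over JDBC, with hypothetical table and column names (locations, geoPoint): the cheap bounding-box test on the plain latitude/longitude columns runs first, and only the surviving rows pay for the exact geodesic STDistance test.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class RadiusSearch {
    // Finds locations within radiusMeters of (lat, lon); bboxDegrees is a
    // precomputed bounding-box half-size in degrees around the center.
    public static ResultSet findWithinRadius(Connection conn, double lat,
            double lon, double radiusMeters, double bboxDegrees)
            throws SQLException {
        String sql = "SELECT id, latitude, longitude FROM locations"
            + " WHERE latitude BETWEEN ? AND ?"
            + " AND longitude BETWEEN ? AND ?"
            + " AND geoPoint.STDistance(geography::STGeomFromText(?, 4326)) <= ?";
        PreparedStatement ps = conn.prepareStatement(sql);
        ps.setDouble(1, lat - bboxDegrees);
        ps.setDouble(2, lat + bboxDegrees);
        ps.setDouble(3, lon - bboxDegrees);
        ps.setDouble(4, lon + bboxDegrees);
        ps.setString(5, "POINT (" + lon + " " + lat + ")"); // WKT: longitude first
        ps.setDouble(6, radiusMeters);
        return ps.executeQuery();
    }
}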

References

• https://msdn.microsoft.com/en-us/library/bb933790.aspx
• http://en.wikibooks.org/wiki/Geospatial_Data_in_SQL_Server
• http://www.dotnetsolutions.co.uk/working-with-spatial-data-in-sql-server-2008/
• https://devjef.wordpress.com/2013/01/12/geometry-vs-geography/




programming

Usable Software Design


In a previous article for Today Software Magazine, 4 Ideas To Apply For Better Software Design, I wrote about the fact that we tend to perceive software design as userless.

Alexandru Bolboacă

alex.bolboaca@mozaicworks.com Agile Coach and Trainer, with a focus on technical practices @Mozaic Works


Whenever we talk about design in domains other than software, we discuss it from a user-centric point of view. Apple’s products are renowned because they focus on the user’s experience with the device: how it feels, how it looks, how fast it responds, the sounds it makes, etc. Software design is the only type of design that seems to be userless. After all, the end-user has no idea how the software works and doesn’t care; all they care about is that it works fine. But software design is not userless. The user is the developer who will have to change the code after you do. If you have collective code ownership (like most Scrum teams these days), you’d better consider user-centric software design. Ideas such as “Clean Code” touch on this idea, but I’d like to explore the topic in more detail.

1. New Developers Working On Usable Software Design Will Get Productive Faster

Why is the usability of web applications such an important topic today? I would argue that it brings a competitive advantage, because users find it easier to start using an application that’s built with the user in mind. No user has time to spend days learning a new product; we want to start using it immediately and get instant benefits. The new users of your software design are the new developers joining your team. We will assume they know the programming language, know the main framework used and have worked in the business domain before. How long does it take for them to become productive in your environment? The time spent familiarizing themselves with the design of the application and with how things are done mostly translates to


expenses. In the economy of the product, that’s waste.

2. Most Common Tasks Get Done Faster With Usable Software Design

Think about the type of work you did on your current product. Chances are some of it follows repeated patterns. In the eHealth application we’re developing, the first few features took a while to get done (NB: we were also learning a new technology at the same time). By carefully looking at what slowed us down and adjusting the design, we optimized the development and got to the point where about 60% of the work is the UI/UX design. Development was no longer the bottleneck. We then looked at optimizing the UI/UX design, but that’s another story. The key to this improvement was realizing, looking back, that the features we develop fall into several types of work:
• add a new entity related to the patient’s medical situation, with create, display, change and hide
• link a medical entity to a journal entry
• display a historical view filtered by various criteria
• etc.
Since we know from the roadmap that more work like this will come, we started optimizing for these types of work. Occasionally, we have to do a new type of work that takes longer. One example was a drug search service that is fast, scalable and easy to update to the latest version of an official drug database. We had to learn and use vert.x and MongoDB to do it, and it took about 3-5 times as much as the usual tasks. Since that is a local situation that is unlikely to repeat, we did nothing to optimize for it. The point is this: like an application that’s easy to use for the most common tasks, usable software design allows fast implementation of the most common types of features. These are the main benefits I see for usable design. But how to achieve it? The first thing is…

3. Measure and Improve

• define & implement the changes • repeat

never seen the product (or the part of the product you want to test) • Ask them to perform those tasks • Measure the time it takes them to do it. Write down where they get stuck. • Use the feedback to improve the product. • Here are some examples of common tasks for a web application: • add a new form with one text field and a save button • add more validations to a field • change the text on a label (for one language or all supported languages) • add a new business rule • display a list of entities in a page • etc.

We’re using a Kanban / XP process, so we used the cycle time distribution diagram to identify the outliers. We have a recurrent retrospective every two weeks where we discuss the impediments and identify potential solutions. The implementation was made in the next two weeks, and we kept an eye to the cycle time distribution in the next months. It was easy to see the improvement since most items moved to the left. In a Scrum context, teams don’t measure cycle time, only velocity. The trouble is that the velocity is an aggregated indicator for all features implemented during the sprint. Therefore, Scrum teams have two Here are a few important things to options for getting to more usable software know about usability tests: design: • Make sure to tell the participants • Quantitative: start measuring the that if they don’t know how to do actual time spent by user story something, it’s the design’s fault and not • Qualitative: run a recurrent retrostheirs. Encourage them to ask questions pective on the topic of usable design. when they get stuck. Ask developers what takes them longer • A complete test with one person than it should. In a team where there’s shouldn’t take longer than 1 hr. trust and transparency, you will imme• Start with the most typical tasks first, diately identify the issues. and with as little information as possible. Only offer information when the This is the basic method to obtain person gets stuck or asks for help. more usable software design. The advanced • Prepare about 10 tasks, but be prepamethod is taken from usability practice… red for partial results. • So we know how to identify the pain 4. Run Usability Tests On Your Software points. I’m sure your next question is…

The process we used for making our design more usable was quite simple: Design • measure how long it takes to impleUsability tests can be ran in multiple ment each feature ways. There’s one format I found most fit • d i s c u s s t h e o u t l i e r s i n t h e for software design: retrospective • Write down a list of tasks the user • identify the root cause in the design has to perform that prevents us from going faster • Bring in a room users that have

5. What Does Usable Software Design Look Like?

Let me start by saying that the idea of focusing on developers as the users of software design is very new. I have seen work around this topic, and I’ve done some myself;



past literature on design has touched this topic without making it explicit. There is, however, a lot of literature on usability. I will only state three basic principles of usability that apply to software design:
• Clarity
• Consistency
• Minimize the unexpected

Here are some direct applications:

Clarity: Name Namespaces Based On Feature

Here’s a screenshot from an application I’m developing. Can you tell what it does based solely on the namespaces? I first heard Sandro Mancuso talking about this idea at I TAKE Unconference 2014, and I was very interested to give it a try. I see it as a very good start for usable software design.

Consistency: Have a Consistent Structure For Each Feature Namespace

Here’s how the namespaces look when expanded: each of them contains three things: a request class, a controller class and a view class. I still have to figure out a better place for the InvoiceFileNameGenerator, as you can clearly see. This is a violation of the third principle, minimizing the unexpected.

Consistency: Each Type of Class Should Have a Consistent Interface

We’ve seen above that a feature namespace consists of three types of classes: a request class, a controller class and a view class. There’s an additional level of consistency that you can reach, specifically in the interface of each of these types of classes. In this example, all the Request classes have one method: response(). All controllers have one method: render(). Each controller uses a view to render the information. This is consistent across feature namespaces.
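In Java, a minimal sketch of this consistent shape might look as follows (the names are hypothetical, for illustration only):

interface Response { String body(); }

interface View { String render(Response response); }

class InvoiceListRequest {
    Response response() {
        // fetch the invoices and wrap them in a Response (omitted)
        return () -> "[invoice1, invoice2]";
    }
}

class InvoiceListController {
    private final View view;
    InvoiceListController(View view) { this.view = view; }

    // Every controller exposes the same single entry point: render().
    String render() {
        return view.render(new InvoiceListRequest().response());
    }
}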

Final Thoughts

Usable software design will bring two main economic benefits:
• faster implementation time for the most common tasks
• easier integration of new people into the project.
To obtain usable software design, we need to get feedback from its users, namely the developers. There are two ways of doing it: through retrospectives and by running usability tests. This idea is not entirely new. Principles such as clarity and consistency have been used for many years to obtain better design. The idea of usable software design is, however, a change in perspective: thinking of the developer as the user of the software design and actively trying to get feedback from them will bring forward changes in the way we organize our code.


Acknowledgments

I have been inspired in writing this post by many conversations with Sandro Mancuso, Samir Talwar, the attendees at SoCraTes UK, Rebecca Wirfs-Brock, Adi Bolboaca, Claudia Rosu and many others. For a more in-depth approach to Usable Software Design, I kindly recommend the Usable Software Design Workshop @ I T.A.K.E. Unconference 2015: http://2015.itakeunconf.com/sessions/alex-bolboaca-usable-software-design.


management

Quality Assurance in Agile-SCRUM environment


Vasile Selegean

vasile.selegean@isdc.eu QA Officer @ ISDC

Imagine you need a paint job for your beloved collector’s car or your new house. You hire the best contractors, instruct them to use the best supplies on the market, give them the best tools and pay them a little more than the market average. They finish the job on time, within the budget. Everyone is happy and you may even throw a little party. But within a month or two, small rust stains become visible or little cracks start to develop. What went wrong? The best team you hired did not allow the foundation layer to dry properly, according to the supplier’s instructions, and started to apply the paint after 4 hours instead of 6. What was missing from this scenario was a properly defined and monitored work process. It was not about skipping an entire phase of the process, something that could have been spotted in time to be fixed before the delivery of the final product. It was just a small “shortcut” that may have looked like an improvement for the moment but, in the long term, proved more expensive than the small delay that would have happened otherwise.

What is Quality Assurance?

“Quality Assurance” is a set of activities often overlooked in Scrum teams. The reasons (or excuses) vary, from a tight schedule and the need for fast delivery, to the quality of the test activities or, worse, not understanding the concept at all! Yet, this is not something that should be so easily put on hold, or postponed until the time is right, as that may never happen. And the consequences could be significant, ranging from time spent fixing (at your expense) after the delivery, to losing the trust of your customer. Furthermore, Quality Assurance is not to be confused with “Quality Control”, the latter being the final step in the QA process, as it needs a functional, potentially shippable product to be sure about its effectiveness. Let’s start with the definition of quality: the quality of a product (either an airplane, a

hotdog, a service or a piece of software) is a distinctive attribute of that product when compared to another, similar product. Sometimes we say that a product A is a “quality product”, but that makes sense only when compared against other similar, available products. More often we use expressions like “product X has better quality than product Y”, which makes more sense according to the definition above. In the end, quality is delivering according to the customer’s expectations. Not less (for obvious reasons), but not too much either, as you may not get paid for the extra effort. The quality of a product or a service is the sum of various quality criteria, either clearly defined by all parties involved at the beginning of the product lifecycle or (sometimes) implied without written agreements. Such attributes are (or should be) the result of a rigorous analysis (and understanding) of the user needs. These needs will be expressed as “quality factors” that will later be built into the final product as the “quality criteria” mentioned above. Those quality criteria have to be measurable and translated into quality metrics that will be used later on to assess the quality of the product (during its development lifecycle and at the moment of



delivery) and also as a baseline for further improvement. The quality of a product is a mix of everyone and everything that contributes to the development of the aforementioned product: the people doing the work, the technology (tools) that is used and the processes that are followed throughout the product development lifecycle. Quality Assurance aims to find and implement a perfect balance between those three “ingredients” mixed for a successful result. Ultimately, quality is what differentiates your product when compared to other similar products. Most likely there are other organizations out there that can provide the same level of expertise in a certain field; some of them are developing a very similar product, some of them cheaper, some faster (or both). For the customer of the year 2015, there is nothing to stop him from grabbing your competitor’s product, so the quality built into your product can make the difference. It does not have to be perfect (there’s no such thing as a zero-defect product), just to fit perfectly to your customer’s needs and be a little better than the others’. Peter Leeson, CMMI and Process Improvement Lead Appraiser and Instructor, summarizes this in his article “Understanding Quality” [1]: Understanding quality is the most critical aspect of your job, whatever it is. Quality is what differentiates your products and services from the others. If your sole focus (as reflected by measurements) is quick and cheap, you will lose the battle: there will always be someone cheaper than you. Quality Assurance is the set of activities aiming to build quality into the final product. It involves every participant in the lifecycle of a product’s development, at any level (regardless of seniority or of the level he/she occupies in the organizational structure). As long as someone can influence the development of a product and/or service, he/she is responsible for the quality of the result. Quality assurance activities can be split into two categories. The first aims to prevent the defects that would arise before the development cycle ends, by defining a standard way of working, checkpoints and reviews. Measurements of various KPIs, analysis of the results, taking corrective and

1 www.qpit.net/blog/understanding-quality.html


Fig. 1: Role of the QA in a project landscape

improvement actions as well as regular assessments of the way of working (at all levels) are part of this process. The second category focuses on defect detection into the completed product and is better known as Quality Control. Of course, all this comes with a cost. In the early phases of a project one needs to plan every QA-related activity: identify customer needs, translate them into quality objectives, define measurements, plan risk mitigation and contingency, plan to monitor the progress, define/identify the best work process for the product to be developed and so on. During the project lifecycle there should be regular monitoring of the performance and quality currently under development: checkpoints planned; plan work product and document review frequency; assess the progress (or the regress) for the period since the last checkpoint and take the appropriate corrective actions if necessary; assess the rapport of the costs vs. value of the activities taken (i.e. cost of review vs. cost of fixing a defect discovered after the product is launched) and adjust the work process to best fit the needs of the project. Furthermore, the results of the various measurements collected throughout the project lifecycle, together with the proven (!) effective practices are inestimable assets of your organization. These can be used as a starting point for development of yet another successful product or when organizing a new team. Having a set of good practices, validated in time, will help you integrate new capacity into the existing environment, either a new team member or a new tool or a new process. Bottom line, the Quality Assurance


activities are vital components of any product development process. It can help your team to stay on the agreed track and, based on measurements, guide back if necessary. It is similar to the GPS device that everyone uses these days. It gives you a route to follow and estimates the time and the cost of the journey. Still, it will not force you on the road, but it will suggest you alternate routes or provide guidance if you wish to get back on a previous trail.

How does this fit into an Agile environment?

The first statement of the Agile Manifesto says that those who embrace this philosophy value “Individuals and interactions over Processes and tools”. So, adding a new process to an Agile environment does not seem like the right thing to do. Right? Let’s see! Reading the Agile Manifesto further, it is clearly stated: “That is, while there is value in the items on the right, we value the items on the left more.” Or go to the “History” page of the Agile Manifesto, written by Jim Highsmith [2]: The Agile movement is not anti-methodology; in fact, many of us want to restore credibility to the word methodology. We want to restore a balance. We embrace modeling, but not in order to file some diagram in a dusty corporate repository. We embrace documentation, but not hundreds of pages of never-maintained and rarely-used tomes.

2 http://agilemanifesto.org/history.html


Check the twelve Agile principles [3]: nothing “stops” you from defining a work process that best suits your needs and makes use of what you have learnt. I assume nobody ever said he values collaboration over quality. And, if someone ever said that, he is most likely out of business by now. Could it be that quality management processes do not go against the Agile principles? Is it possible to define work processes for different stages of the project lifecycle and different roles involved in the product being developed, so that these processes will actually HELP a Scrum team in its day-to-day activities? Scrum (the most used Agile ‘flavour’ these days) is a model, a framework for the software development process. It emphasizes the collaboration between team members and between the team and the customer, aiming to deliver faster and respond better to new or changed requirements. Ideally, the team working by the Scrum model is self-organized and mature. Every role is filled, everyone knows his/her responsibilities and they know what they have to do (and they do it without any hesitation), plus they adapt quickly to the changes in their surrounding landscape. This makes me think of a small Special Forces military unit rather than a group of computer nerds (irony intended). Unfortunately, the maturity level required for a Scrum team to be effective is not something that builds overnight. Your team must have a minimum understanding of the business they are addressing (so the product they build will solve a problem for that business) and your team members must genuinely trust each other. They must be allowed to try and fail in order to come up with the best solution for the issue they are currently facing. As described above, everyone in the team plays an important role in

3 http://agilemanifesto.org/principles.html

building quality into the final product. The common pitfall is that the responsibility is diluted among the team and is often taken as granted (“we’re experienced professionals; we know what we have to do”). Also it is human nature to focus on what you’re good at / paid for. Developers develop and testers test – that’s what we’re here for. Why should a developer check the test scripts or cases? Not a “syntax check”, but to make sure their work fulfills the requirements or works as expected beyond the happy flow! Everyone ever questioned the project manager role / activities in an Agile environment? You have the Scrum Master to manage the team, isn’t it? There is a product owner responsible for the product, why the requirement engineer? And the discussion could go on forever. Quality Assurance can help Scrum teams to clarify the goal and stay on track to actually reach that goal without too much deviation. It also helps clarify the responsibilities of every role involved into the product development lifecycle. Defining a clear way of doing things, no matter it is about development, handling issues or setting up proper monitor and control mechanisms will reduce the risk of making the same mistakes again later on or when approaching a new project. When there’s a clear, measurable, path to the goal you’re trying to achieve it’s easier to observe when things are not going as planned. Any deviation will be noticed before it turns into an issue and improvements or counter-measures can be taken on time if necessary. It is a failsafe mechanism, a fuse box that can prevent or drastically limit the damage that could happen in some cases. A perfect design or system or development process is yet to be invented. Indeed, sometimes the cost of implementing a quality management system is higher than the benefits of having it (think of a really small project, that can be done quickly and the

cost of the eventual defects can be easily supported). In this scenario the organization can rely on the experience of the team, but let’s make this the exception, not the rule we’re living by. By constant (or at least periodical) review of every aspect of the product and project lifecycle, your work process will become a living organism that evolves in time. Scrum methodology focuses on the software development process, but that’s not sufficient for a successful product. As someone gets to know both worlds (Agile/Scrum and process-driven methodologies) there’s an “a-ha!” moment when one realizes that those worlds are not mutually exclusive! You can define a work process that is “alive”, that “evolves” in response to changes, based on Agile principles. Or, if you choose Scrum as you “primary weapon” for development, you still have to make sure that you have a good quality management process or that the requirements completely covers the business needs. This mixed approach is not new and not difficult to implement either! The most difficult challenge is to change the mindset of everyone involved, make them see beyond their current task or short-term goal. To make everyone aware that the ultimate goal is not to complete all tasks in time for demo but to build a quality product that fulfill a business need and can be remembered in a couple of years. And make sure the experience gained during the journey is not lost and can be used for your next quest. Let’s hear from Peter Leeson again4: Over the years, different terminologies have come into existence, which are considered as the new way of doing things. In theory, it means that people have identified the weaknesses of the 4 http://www.qpit.net/blog/getting-started-101-process-agile-or-lean.html

www.todaysoftmag.com | no. 33/march, 2015

33


way they are working and are therefore trying to find a new, more successful approach. In reality, it appears that the weaknesses due to misunderstanding and misapplication of basic principles have led to results which are very different from what was originally expected. We then get a group of people who believe that the new approach is the solution to all their previous problems and start following it with religious fervour, throwing out anything which does not correspond to the new vocabulary and focusing only on applying what they have understood from what they have read in a book; soon they are producing the same mistakes as previous generations and it becomes time for someone to re-create the basic ideas... In conclusion, as in most aspects of everyday life, the truth is somewhere in the middle. There is no absolute truth and no group that can claim to have the right answer to any question. There is no such thing as a “silver bullet” or a universal cure for every challenge we face in our daily routine.

What is the approach on quality?

ISDC answered the processes versus Agile dispute by following the middle path described above. More than five years ago a workgroup of several enthusiasts was created, with full support from the management. Their goal was to develop quality and continuously improve our way of working. Every successful project was analyzed and a set of good practices was extracted from the experience gained through the years. These practices were refined and grouped into specific areas, covering all aspects of a project’s lifecycle, from presale to project closure. Today there are 24 process definitions, with approximately 300 tasks described in detail (who will do it? how should it be done? when? plus input and output criteria). They defined KPI’s and measurements procedures and used these measurements to adjust the process definitions. Needless to say, the group is still working today, analyzing measurements, answering questions and implementing improvement suggestions received from the organization. The processes defined became an internal standard within our organization and work began on disseminating the previous experience (synthesized into the process definitions) back into the organization.

34

As there’s impossible to define a onesize-fit-all process for any work areas in all projects that come in, a set of guidelines for tailoring was created, to adjust the process definition to the specific project’s needs. These guidelines are also under continuous improvement and constant review, making them and the process definition documents better and better every day. A new team was created: the QA Officers team, led by a Quality Manager – to support the project team in defining the quality management plan, set up the quality objectives for their product and define a standardized way of working throughout the project lifecycle. QA Officers observe the everyday work processes within a project and suggest improvements (based on organization’s process definitions) or promote good practices to the organizational standard definitions, so anyone can benefit from that experience. QA Officers are an independent group, thus ensuring the objectivity of the assessment of the performed work processes in every project. In addition to this evaluation, the QA Officers Team identifies and documents non-compliances, provide feedback to the project team and management on quality and performed work processes and it makes sure the non-conformities are addressed in a timely manner. As QA Officers are working close to the project teams they are the first line promoting the internal standards to the project teams and also help collecting feedback for future improvements. This internal standard was built following the “Capability Maturity Model Integration for Development” (CMMI-DEV v1.3) created by the Software Engineering Institute, a non-profit research center at Carnegie Mellon University (www.sei.cmi. edu). Our internal practices were appraised and certified at Capability Maturity Level 3 by the SEI standards – ISDC being one of the very few organizations in Romania to achieve this level. Quality Assurance is a constant concern in ISDC. Beside continuous improvement of the defined processes and tools there are dedicated sessions on “Continuous Improvement” activities for employees and QA Officers are having regular (formal and informal) discussions with everyone having questions on this area. Our external consultant visits us at least once a year and training sessions are organized for anyone interested. Although the main development methodology is Scrum there are processes defined and used for the entire project


lifecycle! To name a few: project planning, risk management, requirements development and management, release and configuration management, quality management and so on. By doing this, we reduce the probability of small cracks appearing in our final product because an apparently minor task was skipped and backfired at the worst possible moment. That probability still exists, but it is up to us to make sure we did as much as possible to keep it at the lowest level.

Bibliography:
1. CMMI® for Development, version 1.3 – Improving processes for developing better products and services, November 2010, Technical report
2. http://agilemanifesto.org/
3. http://www.qpit.net/blog.html – a series of articles by Peter Leeson, CMMI and Process Improvement Lead Appraiser and Instructor at Q:PIT Ltd. (http://www.qpit.net/contactus.html)
4. “Quality Assurance - Making Process Work” – Peter Leeson in ISDC, May 2014


programming

Developing Secure Applications in Java


We will begin this article with some general considerations regarding security. The aim of computer security is to protect the information stored on computers against theft, corruption or natural disasters, while keeping it accessible to its users.

Silviu Dumitrescu

silviu.dumitrescu@accesa.eu Java Line Manager @ Accesa

Diana Bălan

Diana.Balan@accesa.eu Java developer @ Accesa

Security must be understood as a compromise solution. For instance, the best way to create a completely secure application on the Internet is not to connect it to the Internet. One of the most important aspects of security is confidentiality, which means hiding the information sources. The mechanisms used for ensuring confidentiality are: encryption, passwords and access control (giving access to resources to a limited number of people). Another aspect is integrity, which means that the data is protected against unauthorized alterations. This is usually ensured by authentication: the user must provide credentials (username and password). Moreover, detection systems should be used in case the authentication system fails; such a system is made of access logs and analysis patterns. A last aspect is availability, which represents the ability to use a system or a resource when needed. The easiest way to make a system unavailable is a denial-of-service attack, which blocks the users’ access to the system or reduces the system’s performance. The system should be flexible enough to detect these attacks and respond to them.

Security aspects at software level

Any system containing private information is very likely to become a target for attackers. Some of the fundamental concepts of security are:
• Secured APIs: it is much easier to conceive secure code right from the beginning. The attempt to secure existing code is difficult and generates errors.
• Duplication: duplicated code may not be treated consistently on copying. Furthermore, it also breaches the Agile programming principle, Don’t Repeat Yourself (DRY).
• Limiting privileges: if the code operates with limited privileges, then exploitation is most likely to be prevented.
• Trust boundaries: establishing the limits between the different parts of the application. For instance, any data that comes from a web browser into a web application must be checked before being used.
• Security checking: carrying out security checks in a few defined points and returning an object that the client code retains, so that there will be no further need for subsequent checks.
• Encapsulation: using interfaces; fields should be private and accessors should be avoided.

Types of security threats

We can divide the threats into the following categories:
• Injection and inclusion
• Resource management (buffer overflow, denial of service)
• Private (confidential) information
• Accessibility and extensibility
• Mutability

Injection and inclusion represent an attack which determines a program to interpret data in an unexpected way. Therefore, any data coming from an uncertain source must be validated. The main forms of attack are:
• Cross-site scripting (XSS)
• SQL Injection
• OS Command Injection
• Strings formatted in an uncontrolled manner

XSS vulnerabilities appear when:
• data coming from unreliable sources enters a web application



• the web application dynamically generates a web page that contains the unreliable data
• while generating the web page, the application does not prevent the data from being run by the browser as active content, such as JavaScript, HTML tags, HTML attributes, mouse events, Flash or ActiveX
• when using a web browser, the victim visits the generated page, which contains a malicious script that has been injected through the unreliable data
• since the script comes from a web page that has been sent by the web server, the victim runs the malicious script in the context of the web server's domain
• the victim thus breaches the policy of the web browser which says that the scripts in one domain should not be able to access resources or run code in another domain.

The following example illustrates an XSS attack:

<%@page contentType="text/html" pageEncoding="UTF-8"%>
<!DOCTYPE html>
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>Login Page</title>
</head>
<body>
    <h2>Please login</h2>
    <p>Please login</p>
    <form method="GET" action="/XssExample/ProcessForm">
        <b>Login Name: </b>
        <input type="text" size="12" maxlength="12" name="userName" /><br/>
        <b>Password: </b>
        <input type="text" size="12" maxlength="12" name="password" /><br/>
        <b>Location: </b>
        <input type="text" size="12" name="location" /><br/>
        <input type="submit" value="Submit" />
        <input type="reset" value="Reset" />
    </form>
    <p><a href="http://localhost:8080/XSS/ProcessForm?userName=Bob&password=Smith&location=%3CScript%20Language%3D%22Javascript%22%3Ealert(%22You%20have%20been%20attacked!%22)%3B%3C%2FScript%3E">Hacked URL</a></p>
    <p>URL Script text: %3CScript%20Language%3D%22Javascript%22%3Ealert(%22you%20will%20be%20attacked!%22)%3B%3C%2FScript%3E</p>
</body>
</html>

And the corresponding servlet:

@WebServlet("/ProcessForm")
public class ProcessForm extends HttpServlet {
    private static final long serialVersionUID = -5014955266331211217L;

    protected void processRequest(HttpServletRequest request,
            HttpServletResponse response) throws ServletException, IOException {
        response.setContentType("text/html;charset=UTF-8");
        PrintWriter out = response.getWriter();
        try {
            out.println("<html>");
            out.println("<head>");
            out.println("<title>processing Servlet</title>");
            out.println("</head>");
            out.println("<body>");
            out.println("<h2>First page</h2>");
            // the parameters are written back without any escaping,
            // which is exactly what makes the page vulnerable
            out.println("<p>you are logged as: "
                    + request.getParameter("userName") + "</p>");
            out.println("<p>and you are in: "
                    + request.getParameter("location") + "</p>");
            out.println("</body>");
            out.println("</html>");
        } finally {
            out.close();
        }
    }

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        processRequest(request, response);
    }

    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        processRequest(request, response);
    }

    @Override
    public String getServletInfo() {
        return "my Servlet";
    }
}
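The standard defense is to escape untrusted parameters before writing them into the HTML response. A minimal sketch; the escapeHtml helper below is our own illustration, not part of the original example:

// Hypothetical helper: HTML-escapes the characters that enable XSS.
private static String escapeHtml(String s) {
    if (s == null) {
        return "";
    }
    StringBuilder sb = new StringBuilder(s.length());
    for (char c : s.toCharArray()) {
        switch (c) {
            case '<':  sb.append("&lt;");   break;
            case '>':  sb.append("&gt;");   break;
            case '&':  sb.append("&amp;");  break;
            case '"':  sb.append("&quot;"); break;
            case '\'': sb.append("&#x27;"); break;
            default:   sb.append(c);
        }
    }
    return sb.toString();
}

// In processRequest, the parameters would then be written as:
// out.println("<p>you are logged as: "
//         + escapeHtml(request.getParameter("userName")) + "</p>");

With this in place, the injected script payload is rendered as inert text instead of being executed by the browser.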

SQL injection uses unfiltered input data to alter the results of a SQL query. Let's consider the following code:

ResultSet rs = stmt.executeQuery(
        "SELECT * FROM DEMO.Table1 WHERE NAME='" + nameParam
        + "' AND AGE='" + ageParam + "'");

If the attacker sends values' OR 'a' = 'a as one of the parameters, the selection predicate becomes always true, which is equivalent to:

ResultSet rs = stmt.executeQuery("SELECT * FROM DEMO.Table1");
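By contrast, a parameterized query treats the attacker's input strictly as data, never as SQL. A minimal sketch, reusing nameParam and ageParam and assuming the java.sql.Connection from which the stmt above was created:

PreparedStatement ps = connection.prepareStatement(
        "SELECT * FROM DEMO.Table1 WHERE NAME = ? AND AGE = ?");
ps.setString(1, nameParam);  // bound values are treated strictly as data
ps.setString(2, ageParam);
ResultSet rs = ps.executeQuery();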

Without such precautions, the attacker can access private information or can even modify data in the database. This is why any input must be filtered, or better, bound as a parameter, before being used.
OS command injection is based on unfiltered data which alters a command passed to the operating system. Let's consider the following example:

public class ListHomeDir {
    public static void main(String[] args) {
        String userName = "Silviu";
        String curLine = "";
        try {
            Process p = Runtime.getRuntime().exec(
                    "cmd /c dir C:\\Users\\" + userName);
            BufferedReader stdInput = new BufferedReader(
                    new InputStreamReader(p.getInputStream()));
            BufferedReader stdError = new BufferedReader(
                    new InputStreamReader(p.getErrorStream()));
            System.out.println("Home directory is:");
            while ((curLine = stdInput.readLine()) != null) {
                System.out.println(curLine);
            }
            if (stdError.ready()) {
                System.out.println("error:");
            }
            while ((curLine = stdError.readLine()) != null) {
                System.out.println(curLine);
            }
            System.exit(0);
        } catch (IOException e) {
            System.out.println("exception: ");
            System.out.println(e.getMessage());
            System.exit(-1);
        }
    }
}
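One possible mitigation, a sketch of our own rather than something from the article, is to reject any user name that does not match a strict whitelist before it ever reaches the command line, and to keep the command arguments structured instead of concatenating one shell string:

// Reject anything but a plain account name before it reaches the OS.
if (!userName.matches("[A-Za-z0-9_]+")) {
    throw new IllegalArgumentException("Invalid user name");
}
// ProcessBuilder keeps the arguments separate instead of
// concatenating them into a single command string.
Process p = new ProcessBuilder(
        "cmd", "/c", "dir", "C:\\Users\\" + userName).start();

The whitelist is the essential part: with it, a payload such as username;&& del *.*; is rejected before any process is started.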

In the example, we wish to obtain the listing of a directory. The attacker can send: username;&& del *.*;, which can cause the loss of data.
Let's consider the following example of an uncontrolled format string:

public class Main {
    static Calendar c = new GregorianCalendar(1995,
            GregorianCalendar.MAY, 23);

    public static void main(String[] args) {
        String param = "%1$tY";
        System.out.printf(param + " Error !!! %1$te \n", c);
    }
}

In this code, the programmer tries to print an error message around the received value. The problem appears when a format string is sent as input instead of a plain value: with a parameter such as %1$tY, the attacker can figure out the year held in the formatted object, for instance the year when a card expires.
From the point of view of resource management, we have:
• buffer overflows: copying an input buffer into an output buffer without checking the size. The result is that an attacker can run code outside the normal program. Java is immune to this type of attack, as it has automated memory management.
• denial of service: still possible in Java, through the inadequate usage of resources. Here are a few examples of potential denial-of-service attacks:
• zip bomb: a zip file that is relatively small, but includes many other zips in it. Unzipping the files can block the processor and engulf a big storage space. As a protection measure, we can limit the size and the processing that can be done inside such a resource.
• billion laughs attack: if we are using the DOM API for an XML file, we must load the entire file into memory. Such an attack can engulf the entire memory.
• XPath: a language for querying and traversing XML documents. Some queries can be recursive and can return a much bigger volume of data than expected.
• Object Graph: an object graph is built by parsing a text or a binary stream; it can require much more memory than the original data.

Let's consider the following example:

public class FileException {
    Properties appProps = new Properties();

    public static void main(String[] args) {
        FileException app = new FileException();
        app.readProps("AppConfig.properties");
        app.printProps();
    }

    public void readProps(String fileName) {
        try {
            appProps.load(new FileInputStream(fileName));
        } catch (IOException e) {
            System.out.println("Cannot find the configuration file: "
                    + e.getMessage());
            e.printStackTrace();
            System.exit(-1);
        }
    }

    public void printProps() {
        appProps.list(System.out);
    }
}

The system should not provide potential attackers with the exact location of the application's configuration file.
Confidential information should be readable only in a limited context; it should not be available for manipulation; users should be provided only with the information they need; and the information should not be hard-coded in the sources. Private data should not be included in exceptions or log files. Also, we shouldn't hard-code the username and password into the source code; we should use an attributes (properties) file in order to store this type of information, as sketched after the logging example below.
Here is an example of creating and using a log file:

public class BasicLogging {
    Logger logger = Logger.getLogger("com.example.BasicLogging");

    public void logMessages() {
        logger.severe("Critical error");
        logger.warning(" ");
        logger.log(Level.INFO, "Useful info");
        logger.config("Info about CONFIG");
    }

    public static void main(String[] args) {
        BasicLogging bl = new BasicLogging();
        bl.logger = Logger.getLogger("com.example.BasicLogging");
        try {
            bl.logger.addHandler(new FileHandler("Basic.log"));
            bl.logger.setLevel(Level.INFO);
            bl.logMessages();
        } catch (IOException e) {
            // at least report the failure instead of silently dropping it
            System.out.println(e.getMessage());
        }
    }
}
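Following the advice above, credentials can be read from a properties file instead of being hard-coded. A minimal sketch; the file name and the keys are our own assumptions:

Properties credentials = new Properties();
try (FileInputStream in = new FileInputStream("credentials.properties")) {
    credentials.load(in);
}
String user = credentials.getProperty("db.user");
String password = credentials.getProperty("db.password");
// Protect the file itself with file-system permissions
// and keep it out of version control.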

We hope you have enjoyed reading this article and we are looking forward to your questions!



management


A simple approach for Risk Management in Scrum


Sebastian Botiș

Sebastian.Botis@endava.com Delivery Manager @ Endava

In the traditional waterfall model, risks were usually managed by using project risk management frameworks. Nowadays, there is a certain lack of formal risk management techniques in agile software development methods. Agile models claim to be risk-driven: by nature, the iterative approach enables continuous attention to risks, and risks can be reduced by different practices like continuous software integration and early validation. Unfortunately, in reality, the agile model implements only a few risk management practices. This situation led to different actions, and one was to obtain the opinion of project managers involved in managing different projects all over the world. We will discuss an interesting survey that was taken quite recently.
Risk management is the discipline of identifying, monitoring and limiting risks. In ideal risk management, a prioritization process is followed whereby the risks with the greatest loss and the greatest probability of occurring are handled first, and risks with lower probability of occurrence and lower loss are handled in descending order. Effective risk management involves:
• Identifying the risk
• Analysing each risk to determine its exposure (severity of impact)
• Prioritizing the identified risks based on their exposure (a toy computation is sketched after this list)
• Creating action plans (responses) to deal with the high-priority risks
• Continuous monitoring and follow-up to ensure that your action plans are mitigating the risks
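As a toy illustration of the prioritization step (risk names and numbers are invented, not taken from the survey), exposure can be computed as probability times impact and the risks sorted in descending order:

import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class Risk {
    final String name;
    final double probability; // 0..1
    final double impact;      // e.g. estimated cost in days

    Risk(String name, double probability, double impact) {
        this.name = name;
        this.probability = probability;
        this.impact = impact;
    }

    // exposure = probability x impact
    double exposure() {
        return probability * impact;
    }
}

public class RiskPrioritization {
    public static void main(String[] args) {
        List<Risk> risks = Arrays.asList(
                new Risk("Unclear user stories", 0.3, 20),
                new Risk("Unstable test environment", 0.5, 10),
                new Risk("Third-party API delays", 0.2, 5));
        // handle the risks with the greatest exposure first
        Collections.sort(risks, new Comparator<Risk>() {
            public int compare(Risk a, Risk b) {
                return Double.compare(b.exposure(), a.exposure());
            }
        });
        for (Risk r : risks) {
            System.out.println(r.name + " -> " + r.exposure());
        }
    }
}

Running it prints the unclear user stories first (exposure 6.0), then the unstable environment (5.0), then the API delays (1.0).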

I would start the discussion by trying to answer and clarify the following two questions:
• Does this mean risks need to be monitored at iteration level only? The answer is "No".
• Is it required to monitor risks at project level also? The answer is "Yes".
I would say that Risk Management in agile needs to be done at two levels, Project level and Iteration (Sprint) level:
• the Project level Risk Management process is done at a broader level, taking into consideration the whole project and its requirements;
• the Iteration level Risk Management process is done at the iteration level, encompassing the details of the iteration.

Risk management (RM) in agile is similar to the conventional Risk Management approach, with slight deviations. The two processes above seem to be separate, but they go hand in hand during the whole project execution. It is required to understand, identify, track, monitor and mitigate the risks at both levels. The Project Level Risk Management process provides inputs for the Iteration Level Risk Management process. Bear in mind that not all of the Project Level risks will be part of all the iterations: some of the risks could be different and others may be unique to a specific iteration alone. Risk Monitoring happens during the iterations and risk review sessions happen between iterations. It is essential to know where the project and its iterations stand in terms of risk management.

Checklist

I believe every Project Manager, Scrum Master and team should understand the requirements well enough to arrive at the major risks that the project will face. Creating a checklist (a sample can be found below) that helps identify potential risks, curtail them in the initial stages or plan for their mitigation will most probably bring huge savings in time and cost.

Each item in the checklist belongs to an area and is answered Yes/No.

Area: Requirements
• Are the user stories ready?
• Has the product owner identified the end goal of the project?

Area: Design
• Is the design and architecture done already, or is it going to be a separate iteration by itself?
• If the design and architecture are in place, has the team understood them?

Area: Process
• Has the team undergone agile project methodology training earlier? If not, is it required to be conducted?
• Is the whole team trained in and does it understand Agile before embarking on the project?
• Does the scrum master clearly understand the metrics that need to be followed for Agile projects?

Area: Iteration Management
• Is the scrum master aware of the backlog items?
• Does he know about the estimation methodology in agile?
• Is there any tool that is being used for project management activities?

Area: Infrastructure
• Is the infrastructure ready for the project execution?
• Is the communication channel established if the team is globally deployed?

A Risk Management checklist created prior to a meeting can be brought to the meeting to help facilitate the identification of risks. During the meetings, all the risk details are identified and discussed, and consensus is reached on the action items for addressing the risks. A Project Risk Register or an Iteration Risk Register is created and shared within the team, and it is consulted and updated during the course of the project or iteration. Ownership and maintenance of these artefacts lies with the Project Manager / Scrum Master.

Project Level Risk Management Process

The Project Level Risk Management process involves risk identification, planning, monitoring and closing activities at the end of the project. Each one of the steps has its significance, which is explained below. Typical Risk Management activities at the project level are as follows:




Risk Identification

At the beginning of the project, the risks for the project are identified in a broader perspective. The checklist (if created) would be of help in the risk identification process.

Risk Planning

During the planning session, mitigation and contingency plans are put forward for the identified risks. A risk register is created at project level by the Scrum Master / Project Manager and is referenced by the team as required. This is done at the beginning of the project only; the project risk register is then updated during the course of the project.

Risk Monitoring

All the risks identified earlier are reviewed and monitored between the iterations of the project. The risk register is updated for the risks which are closed and for any new risks that are identified along the way. This information is then used in the iteration level discussion.

Risk Management Process at the end of the project

Update the risk register with details of what was done when the risk was confronted during the course of the project. This would be a good reference for future projects. Ensure that the best practices and lessons learnt are recorded and shared.
Below can be seen the Risk Management process for projects executed in a more agile environment.

Iteration/Sprint Level Risk Management Process

The Iteration/Sprint level Risk Management process begins right after the initial project level risk identification and planning process. During the iteration level process, Risk Management involves multiple steps. There are broadly two Risk Management processes for an iteration: Identification & planning, at the START of the iteration, and Monitoring, during the iteration.

Start of iteration

At the beginning of the iteration, a Risk Management meeting is conducted. This involves various steps: Understanding, Identifying, Analysing, Prioritizing and Mapping. Each step has its own significance. This brainstorming discussion would require 2 to 3 hours, involving everyone in the team. Inputs from the project level discussion may be taken as inputs for this discussion. Each step is elaborated below:

Understand: The team should clearly understand the user requirements, both functional and non-functional. Understanding the requirements would be very helpful in the risk identification process of the iteration.

Identify: Once we have an initial draft of the product backlog, with a good understanding of the requirements, the team can proceed with the identification process. During the session, each team member should identify any potential risks. There are many ways to run this session, but a simple and efficient way is to proceed similarly to identifying stories in the backlog, by using sticky notes. As a best practice, no questions should be asked and no discussions should happen during this session. Why? Because this should be a time-boxed session that does not overrun its time box.

Analyse: This time, the collaboration within the team plays an important role. Each identified risk is analysed and grouped into logical categories or areas: e.g. infrastructure, process, third parties, etc. Along with this process, each risk is rated (the scale does not really matter, but it should be kept simple). Once all these are completed, the points/ratings for the grouped risks are totalled. Instead of points or ratings, another possibility would be to assign probability and impact scores.

Prioritize: Once the totalling of the points/ratings for each group is done, the groups are ordered descending, with the risk group which has the highest rating topping the list. For the iteration, take the top 3 risk groups and defer the rest of the risk groups for future debates. When a new iteration is executed, the previous version of the risk register (updated during the entire previous iteration execution) is checked to see if some risks are still present. The same process is repeated until the end of the project.

Map: Mapping is a quick exercise which is done just before kick-starting the execution. The top 5 risks identified are mapped back to the backlog/requirements. This is essential for keeping a close watch while the requirement is being executed. If there is no requirement/backlog item, create one. Now create an iteration risk register with the mitigation plan or the risk response plan for each of the risks identified. This is going to be the reference point for the team during the course of the iteration; the document is updated during the iteration execution cycle. The picture below shows the typical Risk Management process in an iteration, depicting all the steps involved.


During Iteration

Monitoring is a key part of Risk Management, and it happens during the iteration execution. The team responds and acts on the risks as and when they come up during the execution of the project/iteration, as per the risk register. The Scrum Master / Project Manager keeps track of each risk for the iteration. In case a deferred risk comes up during the iteration, it is taken up as part of the next iteration. No time should be wasted on the deferred risks unless they become crucial enough to be addressed as part of the current iteration.

Survey

Last year, different surveys took place all over the world, involving project managers working in different big or small companies. Recently, I found an interesting survey that took place in Eastern Europe, involving around 70 project managers, with the objective of finding out the current practices in agile project management with regards to risk management.

The results of the survey:
• 67 % of respondents review the project risks on a weekly basis.
• 87 % of respondents stated that the project manager is formally responsible for managing risks.
• 27 % of respondents stated that the product owner is also formally responsible for managing risks.
• Respondents were asked who primarily identifies the risks: the project team members (80 %), the project manager (67 %), the development team members (40 %) and the scrum master (40 %).
• 73 % of respondents use a risk log or risk register to document and maintain risks. The remaining 27 % use different techniques or don't document risks at all.
Respondents were also asked what risk attributes they document and keep watching, and what scrum meetings are used as support to discuss the risks with the development team.

Conclusion

There is a lot of useful information and there are other details that could be added in terms of specific instruments that could be used to perform the identification, ranking/rating (based on qualitative and quantitative customized indicators) and tracking of risks during the sprint executions (see the Risk Burndown Graph, based on counting all risks' exposure). Even though using an Agile methodology reduces risk in the early phases of software development, we should also consider the idea that there is a demand to start thinking about making more room for risk management in a more formalized way.
"Risk is idle work, not idle workers" – Ken Rubin

Resources:
1. Risk Management in Agile Model – Monika Singh & Ruhi Saxena
2. Project Risk Management model based on Prince2 and Scrum Frameworks – Martin Tomanek and Jan Juricek
3. Three Key Agile Risk Management Activities – Ken Rubin



design

The importance of prototyping

I got my first job as a UX Designer at a multinational company back in 2011. Since then, I have had the opportunity to work with different methodologies of developing a design and to discover the pros and cons of each of them. Currently, I am employed as a UX Designer and I work with a team of 17 designers in an Agile environment, all with the main purpose of delivering the best experience for the users of the software products developed by the company.
The innovative environment I currently have the chance to be part of, and the freedom of experimenting with diverse methodologies, allowed me to try new procedures for developing a design. Compared to the previous methods I used, I have finally decided to settle on a single one, which I consider to be the best for delivering a successful design: using prototypes.
Therefore, I consider that the creation of a product must always begin with a prototype, no matter if the product is a software or a hardware product. The prototype gives a clear idea and an exact preview of the final product. The prototype is a clone; it's a screenshot of the future product, which will become fully functional after the implementation is completed. For me, there is no exception from this rule. For each project that I work on, I start with an overall idea, designed on a piece of paper, and take it up to the point where it may be constructed as a prototype of the final product, in the smallest detail, regardless of whether it is a layout of an interactive website or a 3D simulation of an object.
The process is carried out in iterations. With each iteration, the prototype gains more details and becomes more powerful, up to the moment when it is considered functional and ready for implementation. The concept of iteration is as follows:
a. The first iteration is always the part where all the necessary information is gathered from the experts in the field for the product whose design is created. All the information is rechecked by the team from a technical point of view. We then determine what functionality is desired, and I draw the first sketch of the future product and determine the design flow for the project that will be carried out.
b. The second iteration focuses in particular on the interaction with the product and on the validation of the functionality set out in the first iteration. What we focus on in particular in this iteration is the way the prototype is used by the target audience. More exactly, we make a first version of the prototype available and we analyze the way the user interacts with it. We want to observe whether the user feels comfortable with the position of the different graphical elements, whether fatigue sets in after long usage, whether he/she identifies the main components easily, etc.
c. Finally, in the last iteration of the design process, the focus is on the visual details and on finishing the prototype. The prototype gets a commercial shape, through complex graphical elements and by repositioning its components, if necessary, based on the feedback resulting from user testing. Collecting this feedback is essential because, when the design is validated, it goes into the implementation step with all the specifications clear and complete, and it will finally be transformed into the product desired by its purchaser.


Iterative process to design a prototype

Each designer has his or her own preferences regarding the tool to be used for carrying out tasks and creating prototypes. Personally, I prefer using Axure to create prototypes, and pencil and paper whenever I need to sketch an idea fast. Why Axure? Because you can develop an HTML prototype which evolves in an organic way, from sketchy drawings early in the first iteration through to a fully designed prototype ready for user testing and build. However, the best tool is the one you feel most comfortable with and can create your designs with in a reasonable amount of time. What really matters is the final design, a design that needs to be easy to understand, usable and implementable.
Creating a prototype has only advantages and, from my experience with different ways of working, it is always a preferred method, rather than creating a product by following the principle "go ahead, do this on-the-go and improve along the way". The product stakeholder will know exactly what enters into production and what they will receive at the end of it. I will elaborate on 5 advantages of using a prototype:


1. It simulates the actual product

The most important advantage of a prototype is that it simulates the actual, exact future product. Using it can attract customers to invest in the product before assigning the necessary resources to the development process. The correctness of the design can be tested before it goes into production, where design errors might otherwise be discovered, compromising your whole project. Also, a prototype is tested on a representative number of users, which helps you discover up front how they interact with the product and what their expectations are.

2. It challenges you to come up with new ideas

Each stakeholder and user has his own vision of the product that needs to be implemented and naturally wishes this vision to be found in the final product. Presenting the prototype helps gauge all the ideas and gives the beneficiary the possibility to see the product from another perspective, to see it materialized and to provide feedback focused on the desired details, on what they initially had in mind. Starting from a low-fidelity prototype that is focused on design flows and directs the user to review functionalities and the steps that need to be followed, we get to the point where we have a high-fidelity prototype, a point where we obtain feedback regarding the visual details, such as fonts, colors, alignment, button size, etc. Feedback is essential to find out what the needs and expectations of the users are and what the requirements of the business are, and to have a clear idea of the direction in which the product is heading.

3. It prevents any major problems

Creating a prototype which can be tested as soon as possible makes it possible to resolve major problems before they cause financial damage in production. By testing the prototype on real users, the suggestions and expectations of the users are obtained in advance, before the product is released. Among the multiple existing methods to trace the issues faced by users, Google Analytics, for example, a tool I use very often, provides the ability to interpret usage data for a product while the user is located in his comfort zone, without knowing that he is being analyzed. Their behavior in the interaction with the product will be natural and realistic, unlike when they are invited to a user-testing session, in which they would know that they are being analyzed. However, user testing in which the user is monitored live is the best way to deal with the problems they are confronting, since there is the possibility to gather much more feedback and see exactly where the user fails in carrying out the usage tasks.
In our work environment, we send user surveys and we select potential users for testing. From all of them, we select the ones with higher chances of using the product heavily, if they also check the option of being available to participate in the usability-testing sessions. These sessions are carried out online. Each user shares their screen in order to demonstrate the way they interact with the prototype. Our role is to trace those design sections where the user gets into difficulty. Based on the information collected from these sessions, we update the prototype and, afterwards, we organize another testing session.

4. Planning

The teams which implement the design receive essential information that helps them plan what needs to be implemented. A prototype may often be considered the specification of the project, and it helps the entire team create user stories and focus on the users' needs. As long as it is carried out in time, before the beginning of a Sprint, it only brings benefits for the Scrum teams. From my experience, having an interactive prototype helps the programmers involved in the implementation process have a clearer idea of the way they have to think out the interaction of the design components, which saves the company time and money. Through the prototype, the message of what needs to be implemented is much clearer; more exactly, the vision from the first iteration is materialized.

5. It's quick and easy to create

Even a stakeholder of the product can build a prototype. What is essential is to provide a simple idea on paper, so that the designer understands the functionalities and the product logic. This simple sketch, which, for example, can be an illustration with a few buttons for a web site, will be converted by an experienced designer into a complex product, highly detailed and ready to be approved for implementation.

Conclusion

I trust that this article has brought value on the topic of prototypes in the production software environment and has shown how important it is to have one. Its impact on the final product is phenomenal. Not only does it prevent problems in the production environment, but it also protects the company from unforeseen costs, streamlines the work flows, forms an overall view for all parties involved (programmers, testing engineers, product owners, stakeholders), and it is an excellent resource that can be used in pre-sales.

Cătălin Timofti

ctimofti@sdl.com UX Designer @ SDL



programming

Our Java Chronicle in action case

Processing quite large text files is not an easy thing to accomplish, even with all the "good guys" around: Hadoop, powerful machines, concurrency frameworks (beyond the Java concurrency utilities). And this is because using them comes with a cost (money, time, or people with the necessary qualification) that is not always negligible, and at the same time with limitations. For example, if you have to validate some of the content against a 3rd party service, using Hadoop for this is a well-known anti-pattern. Using powerful machines is debatable from project to project; maybe a client just doesn't want to pay extra money for making a single functionality faster, one that is not even called that often. Using concurrency frameworks can still be a huge impediment; now, with all those actor frameworks in place, the trend is to know less about how things run under the hood (unfortunately) and just to be good at using them, and this holds even for the plain Java concurrency package.

Img. Product official definition and usage - Chronicle-Queue (2015)

I know what you are thinking now: you can just read the file line by line, processing it, then save state, using buffers and clean Java code – let's call this statement 1.

But wait, this is not all. What if I tell you that the processing of the file should be a nice atomic action, where validations are made over each line and over counts or other metadata from the header or trailer of the file, even over groups of entities inside the file? And if the file is valid (based on what that means for each business requirement), the process has to save some events for each processed line. Based on the above, applying statement 1 will not serve our case anymore, because we have to provide atomic processing. Saving the processed lines in memory till the file is fully processed will lead you to a nice OOME (OutOfMemoryError) – for large files. Presenting you our solution: Chronicle. Java Chronicle. What is this product? "Inter Process Communication (IPC) with sub millisecond latency and able to store every message."

How we used it:


Process:
1. Create the chronicles (IndexedChronicle):

ChronicleConfig config = ChronicleConfig.DEFAULT.clone();
Chronicle entitiesChronicle = new IndexedChronicle("path", config);

2. Read the lines from the file.
3. Unmarshal each line (with BeanIO – out of the scope of this article) to a POJO.
4. Validate the content of the entity.
5. Create additional entities (business requirements) using info from the in-process entity.
6. Serialize the entities (BytesMarshallable):

public void writeMarshallable(@NotNull Bytes out) {
    if (null == entityUuid) {
        out.writeBoolean(false);
    } else {
        out.writeBoolean(true);
        out.writeUTFΔ(entityUuid.toString());
    }
    ...
    writeListEntityMessages(messages, out);
    out.writeStopBit(-1);
}

7. Write them into the chronicle (ExcerptAppender):

// Start an excerpt with the given chunk size
int objectSize = getObjectSize(entity); // how many bytes
entitiesAppender.startExcerpt(objectSize);
// Write the object bytes
entity.writeMarshallable(entitiesAppender);
// pad it for later
entitiesAppender.position(objectSize);

8. After all the content of the file has been consumed, and if the "checks" pass:
9. Read from the chronicle (ExcerptTailer):

ExcerptTailer reader = entitiesChronicle.createTailer();
Entity entity = new Entity();
entity.readMarshallable(reader);

10. Deserialize:

public void readMarshallable(@NotNull Bytes in) throws IllegalStateException {
    StringBuilder valueToRead = new StringBuilder(100);
    boolean hasId = in.readBoolean();
    if (hasId) {
        entityUuid = readUuidValue(valueToRead, in);
    }
    ...
    messages = readListEntityMessages(in);
    in.readStopBit();
}

11. Save the final state to Cassandra.
12. Delete the chronicles:

entitiesChronicle.close();
entitiesChronicle.clear();
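To see how the pieces fit together end to end, here is a compact sketch under the same Chronicle 2.x API; createAppender(), finish() and nextIndex() are our assumptions about that API version, so treat it as an outline rather than a drop-in:

Chronicle chronicle = new IndexedChronicle("/tmp/demo-chronicle",
        ChronicleConfig.DEFAULT.clone());
ExcerptAppender appender = chronicle.createAppender();
for (String line : lines) {              // 'lines' stands in for the parsed file
    appender.startExcerpt(1024);         // reserve enough bytes for one entity
    appender.writeUTF(line);             // a real entity would use writeMarshallable
    appender.finish();
}
ExcerptTailer tailer = chronicle.createTailer();
while (tailer.nextIndex()) {             // replay everything once the file is valid
    String line = tailer.readUTF();
    // ... emit the per-line events here ...
    tailer.finish();
}
chronicle.close();
chronicle.clear();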

And since a number can say more than 1000 words: keeping all those entities on the heap would not be possible on commodity hardware (and without any involvement from the GC), but using off-heap memory becomes a trivial action using Chronicle.

No. of lines       | Read lines (millis) | Base entity | FileEntity (adjacent) | BusinessEntity1 (adjacent) | BusinessEntity2 (adjacent) | Write to chronicle (millis) | Read from chronicle (millis)
1068107 (~200M)    | 850                 | 5000        | 5000                  | 5000                       | 5000                       | 918                         | 350
2136214 (~400M)    | 1312                | 10000       | 10000                 | 10000                      | 10000                      | 1290                        | 448
12817284 (~2.3G)   | 7423                | 60000       | 60000                 | 60000                      | 60000                      | 5071                        | 1530
25634568 (~4.7G)   | 13382               | 120000      | 120000                | 120000                     | 120000                     | 9026                        | 3154

CPU Intel i5-2410M @2.3GHz, 16GB RAM, JVM - 1GB

Conclusions

As you can see, writing and reading the entities to and from the chronicle (a memory-mapped file) take about the same time as reading the lines, so instead of reading the file twice as an alternative, you can use the chronicle and get the unmarshalling of the base entities for free. It took us two hours until we were able to save our first entities; the API is nice and clean, easy to use and understand. The latest version of the library has even more specialized functionalities, like:
• chronicle queue
• chronicle map
• chronicle logger
• chronicle engine
• chronicle set
Definitely, there are some other things to address and analyze, but for our needs this fit the requirements with the cheapest effort.

Vasile Mihali
vasile.mihali@arobs.com
Senior Software Engineer @ Arobs



programming

Getting Started with Apache Cassandra

There was a time when NoSQL was considered a trend, a buzzword, essentially something that didn't apply to common use cases. Nowadays, I believe that NoSQL is no longer a trend and that even medium-sized companies are faced with the concern of increasing data volume. In this situation, the use case of migrating from a relational DB to a NoSQL DB is becoming more common due to the main advantage of this technology, the ability to scale data in clusters. This article will present how to set up a Cassandra database instance and how to create a Cassandra cluster by adding more nodes. We will look at the Cassandra shell and query language (CQL), as well as how to write a simple Java class that uses a Cassandra API client. Finally, we will discuss the limitations of this NoSQL solution, how to overcome them, and some general recommendations.

Cassandra Setup

Cassandra DB is one of the most popular NoSQL DBs, providing the best results for performance and scalability and the ability to partition data across clusters, at no cost (Apache License 2.0). The chart created by DataStax comparing the most commonly used NoSQL DBs highlights Cassandra as a clear leader, with HBase in 2nd place.

Install Requirements
1. JDK 7
2. Python 2.7.x

The database can be installed on Linux as well as Windows, and the required setup is fairly easy: download and extract the software archive, configure the cassandra.yaml configuration file with a few storage folder paths and run the cassandra executable. After the shell is started, everything should look very familiar; CQL (Cassandra Query Language) is almost identical to SQL.

cqlsh:demo> create keyspace demo with replication =
        ... {'class':'SimpleStrategy', 'replication_factor':1};
cqlsh:demo> create table users (
        ...     id varchar primary key,
        ...     name varchar
        ... );
cqlsh:demo> insert into users (id, name) values ('1', 'John');
cqlsh:demo> select * from users where id = '1';

 id | name
----+------
  1 | John

(1 rows)

The one thing that might look unfamiliar from the above example is the keyspace concept which is similar to the Oracle schema and contains configuration regarding how data is replicated inside the Cassandra cluster.

Cassandra Clusters

In order to scale your application's performance, Cassandra allows the configuration of clusters of Cassandra nodes. Through the use of clusters, applications can achieve "... continuous availability, linear scalability, and operational simplicity across many commodity servers with no single point of failure ...". When using a cluster setup, you can define the amount of data that a certain node can handle (through the use of tokens) and where a piece of data should go inside the cluster (see Partitioners), and you can configure the replication strategy, which specifies the number of copies (replicas) of each row that must exist in the cluster at a given time. To set up a simple Cassandra cluster, simply edit the cassandra.yaml configuration file and specify the following:

cluster_name: 'Test Cluster'
seed_provider:
    # Addresses of hosts that are deemed contact points.
    # Cassandra nodes use this list of hosts to find each other and learn
    # the topology of the ring. You must change this if you are running
    # multiple nodes!
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "16.22.70.184"
listen_address: 16.22.70.184

After the configuration file is edited, install Cassandra and copy the cassandra.yaml file to all nodes. The cluster_name value must be the same on all nodes in the cluster. The seed nodes are used so that each Cassandra node can discover the cluster topology (for this simple example, choose one node and set its IP as seed on all nodes). Set the listen_address value to the IP address of the node; it will be used to communicate with the other nodes.

# Node 1 Configuration
cluster_name: 'Test Cluster'
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "16.22.70.184"
listen_address: 16.22.70.184



# Node 2 Configuration
cluster_name: 'Test Cluster'
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "16.22.70.184"
listen_address: 16.22.70.193

After all nodes have been set up, start each of them (using the cassandra executable). To make sure a Cassandra instance has started successfully, look for the following line in the Cassandra startup shell: "Listening for thrift clients...". After the nodes have been started, to verify the cluster status, use the command below:

e:\programs\apache-cassandra-2.1.2\bin> nodetool status
Starting NodeTool
Datacenter: datacenter1
========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns  Host ID                               Rack
DN  16.59.61.100   ?          256     ?     9ae29fc3-b218-4ee2-904a-0db4ff5e28a3  rack1
DN  16.60.161.225  ?          256     ?     48045f86-26a4-4e1f-80fc-12dc16480319  rack1
UN  16.22.70.184   330.11 KB  256     ?     8b42a1e4-abe3-406a-96d8-12c12382b075  rack1
DN  16.22.69.37    ?          256     ?     c3b3feef-41e2-4346-a86e-ec443fbd2fa5  rack1

The nodetool utility used above can also be used to remove/add nodes from/to the cluster, as well as for other maintenance and management tasks.

Sample Java Code

When it comes to API clients, Cassandra offers a wide range of choices depending on the programming language used. At least 6 of the API clients are written in Java, with DataStax, Astyanax and Hector being popular choices. For the examples below we are going to use the DataStax API, since it seems to be one of the mature choices, with clear and detailed documentation. (The company does a lot of active work in the Cassandra ecosystem, provides good documentation for this database and has developed a commercial solution around Cassandra.) In order to write a simple Java client application for Cassandra, simply add the cassandra-driver-core dependency below to the Maven pom.xml file:

<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>2.1.3</version>
</dependency>

The class below connects to the local Cassandra instance and queries the users from the previously created table. In this code sample, the client connects to the Cassandra cluster using the given contact node (the local machine, in this case), which is responsible for providing the cluster topology to the client. After obtaining the Session object, we can construct and execute queries very much in the way we would use an Oracle database.
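A minimal version of such a client; the getUsers and getUsersUsingQueryBuilder names follow the surrounding description, while the remaining details are a sketch assuming the DataStax driver 2.1 API:

import java.util.List;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.Statement;
import com.datastax.driver.core.querybuilder.QueryBuilder;

public class CassandraTestClient {

    public static void main(String[] args) {
        // The contact node provides the cluster topology to the client.
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .build();
        System.out.println("Connected to cluster: "
                + cluster.getMetadata().getClusterName());

        Session session = cluster.connect("demo");
        System.out.println(getUsers(session));
        System.out.println(getUsersUsingQueryBuilder(session));
        cluster.close();
    }

    // Executes the raw select CQL; only use for small result sets,
    // since all() forces the retrieval of all results.
    static List<Row> getUsers(Session session) {
        ResultSet rs = session.execute("select * from users");
        return rs.all();
    }

    // Builds the same query dynamically, eliminating injection attacks.
    static List<Row> getUsersUsingQueryBuilder(Session session) {
        Statement select = QueryBuilder.select().all().from("demo", "users");
        return session.execute(select).all();
    }
}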

The getUsers method executes the raw select CQL and returns all the row instances (only use this for a small number of rows, since it forces the retrieval of all results). The DataStax API also supports the advanced execution mechanisms present in classic database clients, like prepared and batch statements (although, be careful, batches should not be used for performance), as well as options like fetch size, reconnect and retry policy.
Besides constructing CQL statements, DataStax provides a QueryBuilder API that allows the dynamic creation of queries with the added benefit of eliminating injection attacks. The getUsersUsingQueryBuilder method illustrates the use of this mechanism.
The output of the CassandraTestClient.main() method should be:

Connected to cluster: Test Cluster
[Row[1, John]]
[Row[1, John]]

Cassandra limitations and how to overcome them

Even though Cassandra features a lot of advantages over the classic RDBMS, it also has some significant drawbacks. The main disadvantages are:
i. No Transactions
ii. No Joins
iii. No Complex Queries

No Transactions

Cassandra does not offer the guarantee of atomicity and isolation at the transaction level (in fact the transaction concept does not exist); however, it does ensure those properties at the row level. Considering the nature of Cassandra (being a column-oriented key-value store), the row level guarantee makes sense from a design point of view since each row should represent a whole composite entity (and atomic and isolated access to that entity should suffice). When it comes to ensuring data consistency at the cluster level, Cassandra provides the concept of configurable data consistency by defining different consistency levels for read and write operations. These consistency levels can be summarized by how many nodes within the cluster need to acknowledge the read/ write operation. Also, in an effort to aid developers in ensuring data consistency, starting with Cassandra 2.0, support for lightweight transactions has been added. Although the name is promising, the feature actually provides compare and set type operations like only creating a new row if it does not already exist or updating a row based on a condition. Regarding the topic of missing transactions, the widely known software engineer and author, Martin Fowler, says it should not be taken as a major disadvantage since most of the time keeping transactions open for a long period of time can seriously affect performance. Thus in [9] he proposes the solution of offline locks in which all data is versioned and every commit requires the verification of the version information for the data set involved in the operation. The following diagram illustrates this concept.
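For illustration, a lightweight transaction is expressed directly in CQL through IF clauses; executed from Java, and reusing the session and the demo.users table from the sample client above, it might look like this:

// Compare-and-set insert: the row is created only if it does not exist yet.
ResultSet rs = session.execute(
        "insert into users (id, name) values ('2', 'Ann') if not exists");
// Cassandra reports the outcome in a special [applied] column.
boolean applied = rs.one().getBool("[applied]");
System.out.println("created: " + applied);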

No Joins

This widely known limitation is something you have to deal with from the start. Creating (or moving) an application on top of Cassandra will require some NoSQL thinking. This means redesigning your database around aggregate models (don't store a table of users



and a table of addresses; store just one user table where one user record includes all the needed address information as well), even if it means some redundancy or a major DAO refactor. However, depending on your business model, using aggregates in persistency might simplify a lot of the data access layer.
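As a small illustration of the aggregate idea (the field names are invented), the address is folded into the user entity instead of living in its own table:

// One aggregate: the user row carries its address data with it,
// so reading a user never requires a join.
public class User {
    String id;
    String name;
    String street;  // address data denormalized into the user record
    String city;
    String country;
}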

No Complex Queries

If your application requires a lot of flexibility regarding query conditions, Cassandra will not be very helpful. In fact, you will not be able to:
• set a condition on a table column that is not part of the primary key or indexed. In the demo.users example above, we cannot execute the CQL

select * from users where name = 'John';

without explicitly creating an index on the name column;
• set a condition using one of the IN, =, >, >=, <, or <= operators without adding an equality condition on at least one of the key columns. If we define an index on the name column in the demo.users example above, using the CQL

create index users_name_index on users(name);

we can proceed to execute the query

select * from users where name = ‘John’;

however, we cannot execute the similar query

select * from users where name in (‘John’);

because there is no equality condition. In this situation, you are stuck implementing some queries manually. Luckily, most NoSQL solutions support Hadoop integrations and, in this specific case, Cassandra supports two high-level query frameworks, Apache Hive [5] and Apache Pig [6] (integration documentation is very hard to find on the internet; I recommend finding books containing detailed integration steps). Apache Hive offers an SQL-like interface that should feel like never leaving CQL, while Apache Pig features a programming-language interface that allows more control over the executed queries.

Conclusion

NoSQL is an important step towards developing data persistence mechanisms, one that requires a different approach in the way we design applications. Whether we do it out of necessity or programmatic curiosity, I believe that NoSQL is a piece of technology that is worth learning.

References
1. Introduction to NoSQL by Martin Fowler, https://www.youtube.com/watch?v=qI_g07C_Q5I
2. Apache Cassandra 2.1 Documentation, http://www.datastax.com/documentation/cassandra/2.1/cassandra/gettingStartedCassandraIntro.html
3. Apache Cassandra Installation on Windows, http://kimola.com/articles/cassandra-installation-on-windows-platform
4. DataStax Java Driver 2.1 for Cassandra, http://www.datastax.com/documentation/developer/java-driver/2.1/java-driver/whatsNew2.html
5. Apache Hive, https://hive.apache.org/
6. Apache Pig, http://pig.apache.org/
7. Cassandra High Performance Cookbook by Edward Capriolo
8. Cassandra Batch Loading without the Batch keyword, https://medium.com/@foundev/cassandra-batch-loading-without-the-batch-keyword-40f00e35e23e
9. Optimistic Offline Lock, http://martinfowler.com/eaaCatalog/optimisticOfflineLock.html

Sergiu Indrie

sergiu-mircea.indrie@hp.com Software Engineer @ HP


testing


Appium – automated cross-platform testing for mobile devices

The necessity of writing code which can be run on several platforms is not news. The idea was at the base of the Java virtual machine and of the Mono platform. This need was highlighted by the appearance of the mobile operating systems IOS and Android (Windows Mobile is not the topic of this article, though a great part of the enumerated platforms are compatible with it). There is a huge number of applications for both platforms, most of them trying to keep the same interface, in spite of the visual particularities of each platform.
The existence of two separate development platforms for the same application has another important disadvantage (partially counter-balanced, maybe, by the cases of the applications which require maximum performance): the necessity of having separate development teams, with expertise in different languages (Objective-C and Java), which need to be synchronized from the management point of view in order to offer similar functions and a similar experience to the users of the two operating systems.
The increased need for a unitary development platform has generated a great number of solutions: Xamarin (probably the most popular solution, based on C#), PhoneGap (HTML5/JavaScript), Appcelerator (JavaScript), Sencha, Corona, etc. And since automated testing has the exact same need, for the same reasons, the solutions didn't take long to appear. We will go on to talk about Appium, one of the most complete solutions (Calabash is another solution which addresses the same problems, but in a different manner).
Traditionally, automated tests for Android use the uiautomator framework from Google, a solution that uses Java as a programming language, or monkeyrunner, a simpler tool based on Python. The IOS tests use the UI Automation API, a solution natively offered by Apple, which uses JavaScript as a language. Both solutions require relatively deep knowledge of the specifics of the platform, but the main problem is the fact that there is no compatibility between them. Any tester who is going to write automated tests for both platforms will have to know two programming languages (it sounds nice up to here…), to maintain two different testing environments and to maintain two separate sets of tests, even where the flow is exactly the same on both platforms.
The next step towards the unification of the testing environment was the appearance of Selendroid and, respectively, ios-driver, solutions based on the Selenium WebDriver API and the JSON Wire protocol. In other words, these solutions are a middleware situated between the automated testing based on native solutions and the familiar code from Selenium WebDriver. However, these solutions do not solve the initial problems completely, as they merely reduce the programming language to Java.

The real unification of the automated testing development environment is brought by Appium (initially used for testing IOS applications). Appium is based on 4 essential ideas:
• The tested application has to be unaltered, meaning no instrumentation and no rooting, alterations that can stop the tests from being realistic – Appium doesn't require instrumentation and it works with any device.
• High flexibility: the option of several programming languages and libraries – there are Appium clients for Java, Python, JavaScript, C#, etc.; it can easily integrate with validation libraries such as TestNG or jUnit.
• Using established tools – WebDriver is the most widely used solution for web testing, so it is also used in Appium for the testing of mobile devices.
• Open-source – Appium is open-source and it is supported by a powerful community.
As it is built over Selenium WebDriver, Appium is not limited to testing native applications; it also simplifies the testing of hybrid mobile or web applications, basically reducing the test, or its pertaining parts, to a classical Selenium test.

Appium Architecture

Appium is built on a client/server architecture, the server side being based on node.js and the client side being represented by the list of clients available for the languages enumerated above. The communication between server and client is done through REST, a fact which allows the running of remote tests, on machines other than the one to which the physical device or emulator is connected. There are even cloud services for running tests, Sauce Labs being the most important service of this type. The easiest way of starting the application is using a GUI wrapper, available on the Appium site, which does not require the installation of node.js and the pertaining modules. In addition, it also offers a graphic Inspector, useful in the development process for identifying the application components.


Since Appium is an additional layer over the native automation platforms, there are certain conditions for running the tests. Basically, the IOS tests can be run only on the MAC OSX platform, because of the dependency on XCode (of course, they can be run remotely, from a Windows machine). On the other hand, Android tests can be run from machines using Windows, OSX or Linux, the main condition being the installation and configuration of the Android SDK on the respective machine.

Code equals 1000 images, which equal 1000 words each

The approach, from the point of view of the code, is roughly similar to code written on a platform based on Selenium WebDriver, with certain differences. Below, we will present a code organization which mainly aims at simplifying test maintenance, by heavily using the PageFactory pattern together with the Appium version of the assertions specific to Selenium. As a test framework, we will use TestNG, which is more useful in automated testing than jUnit, but Appium is complete in this respect. We will imagine the login part of an application with an identical look on Android and IOS, which shows a splash screen on starting. The login screen is simple; it contains two text fields for user and password and a SignIn button. There is also a label item, which will display an error message if the user data is incorrect.
The easiest way to create an Appium project is to use Maven; the pertaining configuration file, pom.xml, will have to contain the Selenium dependencies and the following section (according to the current version of the Appium Java client):

<dependency>
    <groupId>io.appium</groupId>
    <artifactId>java-client</artifactId>
    <version>2.1.0</version>
</dependency>

It all begins with test preparation; a practice that is often useful is the creation of a parent class of the BaseTest type, which will be inherited by the test classes. This is the place where we can do the initial configuration for the test suite in a method that has the @BeforeSuite annotation, for each generic test in the method marked with @BeforeTest and, of course, the tearDown part. The easiest alternative requires the initialization of an object of the AppiumDriver type (or AndroidDriver/IOSDriver in the newer versions) in the setup method of the tests, attaching a DesiredCapabilities object to it, which basically provides information on the used device. This object will be used during the test to send commands towards the mobile device and to extract information. Once the test is completed, the object will be explicitly closed in the teardown method. Below, there is an example for a Nexus 10 tablet:

@BeforeMethod
public void setUp() throws MalformedURLException {
    File app = new File("appFolder", "myApp.apk");
    DesiredCapabilities capabilities = new DesiredCapabilities();
    capabilities.setCapability("platformName", MobilePlatform.ANDROID);
    capabilities.setCapability("deviceName", "Nexus 10");
    capabilities.setCapability("platformVersion", "5.0.1");
    capabilities.setCapability("app", app.getAbsolutePath());
    capabilities.setCapability("appPackage", "org.myapp.demo");
    capabilities.setCapability("appActivity", "org.myapp.SplashActivity");
    capabilities.setCapability("noReset", false);
    AndroidDriver driver = new AndroidDriver(
            new URL("http://127.0.0.1:4723/wd/hub"), capabilities);
}

The essential parts in this configuration are: app – the location of the application, which will automatically be uploaded to the tablet; appPackage – specifying which package will be run (specific to Android); appActivity – specifying which activity the test has to wait for; noReset/fullReset – specifying whether the application is going to be reinitialized or, respectively, reinstalled. The tearDown part will close the application and the driver object:

    @AfterMethod
    public void tearDown() throws Exception {
        driver.closeApp();
        driver.quit();
    }
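Putting the two lifecycle methods together, the BaseTest parent class mentioned earlier might be sketched as follows; this is only a minimal sketch, and the buildCapabilities() helper is our own hypothetical addition, not something prescribed by Appium:

    import java.net.URL;
    import io.appium.java_client.android.AndroidDriver;
    import org.openqa.selenium.remote.DesiredCapabilities;
    import org.testng.annotations.AfterMethod;
    import org.testng.annotations.BeforeMethod;

    // minimal sketch of a BaseTest parent class; each test class
    // supplies its own capabilities through the abstract helper
    public abstract class BaseTest {
        protected AndroidDriver driver;

        @BeforeMethod
        public void setUp() throws Exception {
            // capabilities are assembled exactly as in the example above
            driver = new AndroidDriver(
                    new URL("http://127.0.0.1:4723/wd/hub"), buildCapabilities());
        }

        @AfterMethod
        public void tearDown() throws Exception {
            driver.closeApp();
            driver.quit();
        }

        protected abstract DesiredCapabilities buildCapabilities();
    }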

In order to simplify the comprehension of the examples, we list below the UI hierarchy of the login section of the screen, for both platforms:

Android

<android.widget.RelativeLayout resource-id="org.myapp.demo:id/signin_layout"
    class="android.widget.RelativeLayout" package="org.myapp.demo">
  <android.widget.EditText text="JohnDoe"
      resource-id="org.myapp.demo:id/signin_email_text"
      class="android.widget.EditText" package="org.myapp.demo"/>
  <android.widget.EditText
      resource-id="org.myapp.demo:id/signin_password_text" password="true"
      class="android.widget.EditText" package="org.myapp.demo"/>
  <android.widget.TextView
      resource-id="org.myapp.demo:id/signin_result"
      class="android.widget.TextView" package="org.myapp.demo"/>
  <android.widget.Button text="Sign In"
      resource-id="org.myapp.demo:id/signin_ok_button"
      class="android.widget.Button" package="org.myapp.demo"
      content-desc="SignInButton"/>
</android.widget.RelativeLayout>

iOS

<UIATableView name="authentication" enabled="true" visible="true">
  <UIATableCell name="" label="" visible="true">
    <UIATextField name="user" label="JohnDoe" value="username" visible="true"/>
  </UIATableCell>
  <UIATableCell name="" label="" visible="true">
    <UIATextField name="password" label="" value="password" visible="true"/>
  </UIATableCell>
  <UIATableCell name="" label="" visible="true">
    <UIAButton name="SignIn" label="SignIn" value="SignIn" enabled="false" visible="true"/>
  </UIATableCell>
  <UIATableCell name="" label="" visible="false">
    <UIAStaticText name="labelSignIn" label="" value="" visible="false"/>
  </UIATableCell>
</UIATableView>

The test presented will be simple, but enough to illustrate the concept and the design of the tests. The steps are:
1. The user starts the application and checks the appearance of the SignIn screen, the existence of a placeholder text in the username field ("JohnDoe") and the fact that the SignIn button is grayed out.
2. The user performs a touch action in the username field and checks the disappearance of the initial placeholder text.
3. The user enters the correct username and an incorrect password and checks that the SignIn button is activated.
4. The user performs a touch action on the SignIn button, validates the appearance of an error message and checks that the button is still visible.

The easy and usually inefficient approach is to write a test which reproduces these steps by identifying the objects the test interacts with, carrying out the actions and validating the results. The major problem of this approach (in fact, of automated tests of this type in general, regardless of the product used) is that test maintenance becomes difficult as the number of tests increases. If we have 20 tests that carry out actions on the same items, the moment one of the items changes, it will have to be modified in all the tests. Even if we extract the criteria used to identify the items from the tests and make them members of the test class, maintenance is still rather difficult and, moreover, the test will have to be duplicated for both platforms. A more elegant and efficient approach in the case of big suites of tests is to separate the actual tests from the objects that model the screens of the application in the code. Basically, each screen will have a corresponding Java class, and there is also the possibility of separately modeling portions of the screen, in the case of more complex screens. The criteria for item identification are going to be private members of the class, thus observing the OOP philosophy, while the actions and characteristics of the items will be exposed as public methods, easy to use in the actual test. This pattern is similar to the PageFactory used in Selenium WebDriver tests.

The class that will model the screen (actually the portion of the screen containing the authentication elements) is presented here below, the explanations being given in the form of Java comments, starting from the premise of self-describing code:

    public class LoginScreen {
        private static final String PKG = "org.myapp.demo";
        private AppiumDriver driver;

        // the elements are identified for both platforms,
        // greatly simplifying the tests in the case when the
        // interface is identical
        @iOSFindBy(name = "user")
        @AndroidFindBy(id = PKG + "/signin_email_text")
        private MobileElement userTxt;

        @iOSFindBy(name = "password")
        @AndroidFindBy(id = PKG + "/signin_password_text")
        private MobileElement passwdTxt;

        // we are using an xpath identification criterion,
        // here for teaching purposes
        @iOSFindBy(xpath = "//UIAStaticText[@name='labelSignIn']")
        @AndroidFindBy(xpath = "//android.widget.TextView[@id='" + PKG + "/signin_result']")
        private MobileElement labelTxt;

        @iOSFindBy(className = "UIAButton")
        @AndroidFindBy(uiAutomator = "text(\"Sign In\")")
        private MobileElement signinBtn;

        public LoginScreen(final AppiumDriver driver) {
            this.driver = driver;
            // the identification criteria annotated above are initialized
            PageFactory.initElements(new AppiumFieldDecorator(driver), this);
            // the test will wait for MAXIMUM 10 seconds for the
            // visibility of the user and password fields
            final WebDriverWait wait = new WebDriverWait(driver, 10);
            wait.until(ExpectedConditions.visibilityOf(userTxt));
            wait.until(ExpectedConditions.visibilityOf(passwdTxt));
        }

        public void touchUserNameBox() { userTxt.click(); }
        public void typeUserName(String username) { userTxt.sendKeys(username); }
        public String getVisibleUsername() { return userTxt.getText(); }
        public void typePassword(String pwd) { passwdTxt.sendKeys(pwd); }
        public boolean isSignInBtnEnabled() { return signinBtn.isEnabled(); }
        public boolean hasErrorMessage() { return labelTxt.isDisplayed(); }
        public boolean isSignInBtnVisible() { return signinBtn.isDisplayed(); }
        public void touchSignInBtn() { signinBtn.click(); }
        public String getErrorMessage() { return labelTxt.getText(); }
    }

The corresponding test will use the public methods from the class above and, of course, assertions for validation:

    @Test
    public void testLogin() {
        LoginScreen screen = new LoginScreen(driver);
        assertEquals(screen.getVisibleUsername(), "JohnDoe");
        assertFalse(screen.isSignInBtnEnabled());
        screen.touchUserNameBox();
        assertEquals(screen.getVisibleUsername(), "");
        screen.typeUserName("TotallyValid");
        screen.typePassword("badpassword");
        assertTrue(screen.isSignInBtnEnabled());
        screen.touchSignInBtn();
        assertTrue(screen.hasErrorMessage());
        assertEquals(screen.getErrorMessage(), "Invalid credentials");
        assertTrue(screen.isSignInBtnVisible());
    }

This organization of the tests helps a lot with maintenance, as modifications in the UI often require only changing the identification criteria of the elements in the classes which model the screens of the application, the tests continuing to function correctly with no modification (supposing, of course, there are no changes in the application flow). Another important advantage is that the test becomes easy to read: one can easily notice the steps and the validations. It also becomes easy to implement BDD in order to further simplify the tests (where it can be applied), by using one of the tools available on the market or implementing a new variant. The identification criteria of the elements can be xpath, id, name, className and UIAutomator. This last criterion uses the iOS (predicates) and Android UIAutomator APIs directly in order to identify the elements when the other variants are not optimal. Sometimes, this type of identification requires a previous study of the implementation specifics (especially iOS predicates), offering great flexibility and equal complexity.
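As an illustration of this last strategy, a hedged sketch of both platform-specific locators follows; the selector texts and the androidDriver/iosDriver instances are hypothetical, not taken from the application above:

    // Android: a UiSelector expression, evaluated by the uiautomator
    // framework on the device
    WebElement signInBtn = androidDriver.findElementByAndroidUIAutomator(
            "new UiSelector().text(\"Sign In\")");

    // iOS: a UIAutomation predicate, evaluated through Instruments
    WebElement errorLabel = iosDriver.findElementByIosUIAutomation(
            ".elements().withPredicate(\"name == 'labelSignIn'\")");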

Conclusions

Appium has become, in a short time, as popular for testing mobile applications as Selenium WebDriver is for web applications. The list of advantages is long: it is open-source, the tested application is the final one (no intervention on the code is required in order to make the application testable), a great number of supported languages, high compatibility with devices and operating system versions, tests written only once for both platforms (when the interface is similar), support for native/hybrid/web applications, and a big community. There are, of course, some disadvantages, too: the iOS tests can be run only on Mac machines and you cannot run more than one test simultaneously on the same machine. This last disadvantage can be removed by using a cloud service such as Sauce Labs.

Vasile Pop

vasile.pop@intel.com Software Engineer @ Intel România



others


Future-Oriented Technology Analysis (FTA)

I started writing on this subject thinking about what future technology will look like and how we can anticipate and prepare for the changes of the forthcoming years. Scientists are putting a lot of effort into research today, trying to reveal the future of science, technology, economy and society in order to identify strategic areas which can generate huge benefits for our society and economy.

Future-Oriented Technology Analysis, or FTA, is defined as an umbrella term for a broad set of activities that facilitate decision making and coordinated action, especially in science, technology and innovation policymaking (Eerola and Miles 2011). One might ask who is interested in this kind of analysis. Currently, FTA is an important topic on the E.U. agenda and for multinational companies. Some of the big industries where FTA is largely used are energy and environment, information technology, e-commerce, manufacturing and robotics, medicine and biogenetics, transportation and space.

Moving forward with my subject, I will give more details on what FTA means and what it implies. Some of the basic principles of FTA are:
• Future oriented
• Participation
• Evidence-based
• Multidisciplinary
• Coordinated mobilization of people and resources
• Action orientation

The disciplines which form the FTA concept are foresight, forecasting, technology assessment, technical intelligence and road mapping. These contribute to a better understanding of future challenges and shape sustainable solutions for the future. I will briefly define four of them and then detail the foresight discipline.

Forecasting – the discipline which calculates and anticipates the flow of an event, as well as the pace of change. This is seen as the result of a rational study and a thorough analysis of data. An example would be a government decision to support a space program: this would have a major impact on electronics, introduce new materials, etc. At the same time, the decision could have a negative impact on terrestrial transportation, which might remain underfinanced, as the money is limited.

Technology assessment – a systematic attempt to foresee the consequences and the multiple scenarios of introducing a particular technology. It is hard these days to separate people's daily activities from technology. Therefore, we can define technology as the ways and means through which people produce goods. The moment technology changes, the whole system of which people are also part might change.

Technical intelligence – another name for this would be competitive intelligence. We can define it as using intelligence in relation to the technical possibilities of the competition.

Road mapping – combining anticipated advances in technologies and products to generate plans.



It is often used by companies for product planning.

Foresight – the ability to predict what will happen or be needed in the future. This is the discipline that should be used only by experts, in order to fully trust the analysis results. The domain experts analyze any available information to identify potential foresights. At this level they set the strategy goals, which also implies a participatory mechanism. Standard approaches will remain irrelevant, and the plan is the least important element which adds to the end results. In this discipline it is important to mention that we do not discuss a predetermined future, but explore the ways our future can evolve, depending on present actions and decisions. In the picture below we have the foresight context as a guideline.

Going one step deeper into our subject, you can study the generic process of foresight. As inputs, we have the things that are currently happening. Inside the actual foresight process, a detailed analysis takes place.

For each of the FTA disciplines defined above, the analysts use a set of combined methods (exploratory, advisory, explanatory and participatory). Because the space for this article is limited, those who are curious and want a better picture of the methods of analysis used for foresight can search on the internet for the Futures Diamond scheme.

To conclude on such a complex subject as Future-Oriented Technology Analysis, we can say that it is an umbrella term for a broad set of disciplines (foresight, forecasting, technology assessment, technical intelligence, road mapping). These are used by governments, research centers and NGOs, as well as by private companies which want to maintain their market share or raise their profits. It is highly valuable for our society that these disciplines facilitate decision making and streamline coordination around science, technology and innovation, so that sustainable future policies can work in everyone's benefit.

References:
1. http://www.nesta.org.uk/publications/quantitative-analysis-technology-futures-part-1
2. https://ec.europa.eu/jrc/en/event/site/fta2014
3. http://thinkingfutures.net/wp-content/uploads/2010/10/An-Overview-of-Foresight-Methodologies1.pdf
4. http://www.techmonitor.net/tm/images/3/37/03jul_aug_sf2.pdf
5. http://web.mit.edu/smadnick/www/wp/2008-15.pdf

Ioana Armean

ioanaa@imprezzio.com Business Analyst @ Imprezzio Global


