x264 git snapshot proxy
This CGI fetches and relays snapshots of the x264 git repository, tagged with revision numbers.
Revision numbers are derived from the x264.nl changelog, counting one commit as one revision.
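The one-commit-per-revision numbering can be sketched roughly as below. This is only an illustration, not the CGI's actual code: it assumes the changelog is plain `git log` output (each entry starting with `commit <hash>`), listed newest first, and that the revision of the newest entry is already known. The names `parse_changelog` and `number_revisions` and the sample text are hypothetical.

```python
import re

# Assumption: each entry starts with a line "commit <40 hex digits>",
# as in plain `git log` output. The real changelog layout may differ.
COMMIT_RE = re.compile(r"^commit ([0-9a-f]{40})$", re.MULTILINE)


def parse_changelog(text):
    """Return the commit hashes in the order they appear (newest first)."""
    return COMMIT_RE.findall(text)


def number_revisions(hashes, newest_revision):
    """Map each hash to a revision, counting down one revision per commit."""
    return {h: newest_revision - i for i, h in enumerate(hashes)}


# Hypothetical excerpt built from the two newest entries listed on this page.
sample = (
    "commit a3ac64b8b467eea1264c0053022893bc84b2e9a2\n"
    "Author: Anton Mitrofanov\n"
    "\n"
    "    OpenCL support improvement/refactoring\n"
    "\n"
    "commit c47347c01eb4d9933e2d9705f44707dbb396f611\n"
    "Author: Jason Garrett-Glaser\n"
    "\n"
    "    x86: shave a few instructions off AVX deblock\n"
)

revisions = number_revisions(parse_changelog(sample), newest_revision=2334)
for git_id, rev in revisions.items():
    print(f"x264r{rev} {git_id}")
```

Counting down from a known newest revision keeps the sketch independent of how many older commits are missing from the cached listing.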
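Notes
- Because this works as a proxy, downloads are quite slow.
- The changelog is cached, so if the listing looks out of date, try refreshing it with the link below.
- Cache refreshes must be at least 3 minutes apart, and the listing shows the latest 300 entries.
- I originally made this for my own use, so the URL of this CGI changes often.
I recommend linking to the main site instead.
- This is a proxy (relay), not the provider of x264.
Anything downloadable below is simply relayed from the x264 git repository.
- Naturally, any change in how the x264 git repository or x264.nl is operated will affect this service.
The processing is fairly hard-coded, so there is no telling how long it will keep working properly.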
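The two cache limits in the notes above (a refresh interval of at least 3 minutes and a listing capped at 300 entries) could be enforced along these lines. This is a minimal sketch under assumptions, not the CGI's real logic: the function names and the idea of storing the last refresh time as a Unix timestamp are hypothetical.

```python
import time

MIN_REFRESH_INTERVAL = 3 * 60  # seconds; the 3-minute minimum from the notes
MAX_ENTRIES = 300              # the listing shows the latest 300 entries


def may_refresh(last_refresh, now=None):
    """True when at least the minimum interval has passed since the last refresh."""
    if now is None:
        now = time.time()
    return now - last_refresh >= MIN_REFRESH_INTERVAL


def visible_entries(entries):
    """Keep only the newest entries; `entries` is assumed to be newest first."""
    return entries[:MAX_ENTRIES]


print(may_refresh(last_refresh=1000.0, now=1100.0))  # only 100 s elapsed
print(may_refresh(last_refresh=1000.0, now=1180.0))  # exactly 180 s elapsed
print(len(visible_entries(list(range(500)))))
```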
Cache refresh link
cached from : http://x264.nl/x264/changelog.txt
x264r2334
git-id : a3ac64b8b467eea1264c0053022893bc84b2e9a2
revision : r2334
Author : Anton Mitrofanov
Date: Mon May 6 22:51:11 2013 +0400
OpenCL support improvement/refactoring
Autoload the OpenCL library so that it's not required to run an OpenCL-enabled
build of x264.
Update X264_BUILD, which should have been changed with the first patch.
x264r2333
git-id : c47347c01eb4d9933e2d9705f44707dbb396f611
revision : r2333
Author : Jason Garrett-Glaser
Date: Thu May 16 13:51:37 2013 -0700
x86: shave a few instructions off AVX deblock
x264r2332
git-id : b547a4ea1169411610855002db9a8182b1e73314
revision : r2332
Author : Henrik Gramner
Date: Tue May 14 18:57:40 2013 +0200
x86: AVX2 dequant_4x4_dc
x264r2331
git-id : 907573d3f7873b7600cc94d1e287d52628e11766
revision : r2331
Author : Henrik Gramner
Date: Tue May 14 18:53:12 2013 +0200
x86: AVX2 high bit-depth dequant
x264r2330
git-id : 442c6a420f8727d2f4087e9f3f317fb1774b9262
revision : r2330
Author : Jason Garrett-Glaser
Date: Thu May 9 17:20:05 2013 -0700
x86-64: 64-bit variant of AVX2 hpel_filter
~5% faster than 32-bit.
x264r2329
git-id : 26a6451591cd7cd25fcfeeacee3850e5dd7a7f7e
revision : r2329
Author : Henrik Gramner
Date: Mon May 6 18:41:24 2013 +0200
x86: AVX2 high bit-depth denoise_dct
28->15 cycles
Also reorder instructions to use fewer registers, 3 cycles faster on Ivy Bridge with 64-bit Windows.
x264r2328
git-id : db95d6af63bec7839b3d3e1f2eb67b8689dc8170
revision : r2328
Author : Henrik Gramner
Date: Sat May 4 18:48:58 2013 +0200
x86: AVX2 high bit-depth quant
quant_4x4: 13->6 cycles
quant_4x4_dc: 14->8 cycles
quant_8x8: 47->24 cycles
quant_4x4x4: 48->25 cycles
x264r2327
git-id : 327386f70836507cb44266e5d71bd1d744fe3d78
revision : r2327
Author : Jason Garrett-Glaser
Date: Wed May 1 14:32:11 2013 -0700
x86: AVX2 add16x16_idct_dc
27 -> 19 cycles
x264r2326
git-id : c82db4ed07d4a69a84ac99d5e79e32f61141494f
revision : r2326
Author : Jason Garrett-Glaser
Date: Mon Apr 29 16:16:54 2013 -0700
x86: faster AVX2 quant_4x4x4
10->9 cycles
x264r2325
git-id : b79f4a6e460b00c85f0ee67b03299bf1d15dd48c
revision : r2325
Author : Jason Garrett-Glaser
Date: Sat Apr 27 21:03:32 2013 -0700
x86: AVX2 intra_sad_x3_8x8c
30->22 cycles
x264r2324
git-id : 2c0bca3f798e20133f61c3517202942e873e00d6
revision : r2324
Author : Henrik Gramner
Date: Sun Apr 28 11:11:03 2013 +0200
x86: AVX2 high bit-depth intra_sad_x3_8x8
43->24 cycles
x264r2323
git-id : b2c30e1a470181b591619b211ae0342e9cc8aac9
revision : r2323
Author : Jason Garrett-Glaser
Date: Wed Apr 24 14:22:15 2013 -0700
x86: AVX2 deblock strength
30->18 cycles
x264r2322
git-id : 37edf16c1955cfc9d2843024af0fa7aa6268ad90
revision : r2322
Author : Henrik Gramner
Date: Wed May 1 17:42:48 2013 +0200
x86: Faster high bit-depth intra_sad_x3_4x4
20->16 cycles on Ivy Bridge
x264r2321
git-id : a9ed051f2bc73c9bfeff006d7328bd2bc99ce147
revision : r2321
Author : Jason Garrett-Glaser
Date: Tue Apr 30 17:36:46 2013 -0700
x86: faster SSSE3 hpel
~7% faster using the pmulhrsw trick from mc_chroma.
x264r2320
git-id : 9373d5fa6e7a5cc5bcc756125cbc2e7fe058ea43
revision : r2320
Author : Jason Garrett-Glaser
Date: Mon Apr 29 14:22:23 2013 -0700
x86-64: faster SSSE3 trellis
~2% faster trellis.
x264r2319
git-id : 2a716040eb8b89efd92ea61ab08ecc41bf0b8623
revision : r2319
Author : Jason Garrett-Glaser
Date: Thu May 2 17:10:26 2013 -0700
x86: 32-byte align the stack if possible
Avoids the need for manual 32 byte array alignment on compilers that support
-mpreferred-stack-boundary.
x264r2318
git-id : eefaff1128ea9eb8dcd6796957ca5e56727337b8
revision : r2318
Author : Henrik Gramner
Date: Sat May 11 23:39:09 2013 +0200
x86inc: Utilize the shadow space on 64-bit Windows
Store XMM6 and XMM7 in the shadow space in functions that clobber them.
This way we don't have to adjust the stack pointer as often,
reducing the number of instructions as well as code size.
x264r2317
git-id : b4be6e56629cf8fdcf53adc6b879969d8f6760b3
revision : r2317
Author : Henrik Gramner
Date: Fri May 3 23:06:10 2013 +0200
x86: Don't use explicitly aligned versions of SAD on AVX CPUs
On modern CPUs movdqu isn't slower than movdqa when used on aligned data and using the same code in both cases saves cache.
This was already done for the high bit-depth AVX2 implementation but the aligned version still exists as dead code so remove that.
x264r2316
git-id : 99f553ec300d928d23522304ebf4818574b85ed3
revision : r2316
Author : Henrik Gramner
Date: Fri May 3 20:18:03 2013 +0200
x86: Add missing initializations for high bit-depth sad_aligned
x264r2315
git-id : 42f2f78a05985a49fea0fb1bff050c95257810bb
revision : r2315
Author : Jason Garrett-Glaser
Date: Mon May 13 16:52:18 2013 -0700
x86: add Jaguar CPU detection
x264r2314
git-id : f12a17f5ecde41148256cb0c132cb31ac6602f3e
revision : r2314
Author : Henrik Gramner
Date: Tue May 7 17:21:03 2013 +0200
x86inc: Remove .rodata kludges
The Mach-O bug was fixed in yasm 0.8.0 and we don't support versions that old.
a.out was superseded by ELF on sane systems a few decades ago.
x264r2313
git-id : c3b166a6cf55afaeea5bbc94ebb275b92efbd3d8
revision : r2313
Author : Henrik Gramner
Date: Sat May 4 16:21:32 2013 +0200
checkasm: Use 64-bit cycle counters
Prevents overflows that can occur in some cases.
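Why a wider counter helps can be shown with a small simulation. This is only a sketch: the changelog does not say where the overflow occurred, so summing many per-run cycle counts into one accumulator is an assumption, and the sample figures are made up.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def accumulate(samples, mask):
    """Sum cycle counts in a counter that wraps at the given width."""
    total = 0
    for s in samples:
        total = (total + s) & mask
    return total


# Hypothetical workload: 5000 runs of 1,000,000 cycles each.
samples = [1_000_000] * 5000
print(accumulate(samples, MASK32))  # wrapped past 2**32
print(accumulate(samples, MASK64))  # full sum fits
```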
x264r2312
git-id : e943696e98ba9a75f5100c5692e39708ff2cc422
revision : r2312
Author : Henrik Gramner
Date: Fri May 10 13:55:32 2013 +0200
checkasm: Fix stack alignment bug
x264r2311
git-id : b1749e204d14087a768990e8bfe964d343e0b9a9
revision : r2311
Author : Jason Garrett-Glaser
Date: Wed May 8 10:48:41 2013 -0700
Fix invalid memcpy in sliced-threads
Likely didn't actually break in practice, but memcpy with src==dst
is incorrect.
x264r2310
git-id : 76a5c3a19f97cd34b65aeff050de4042b054bc65
revision : r2310
Author : Jason Garrett-Glaser
Date: Mon Apr 29 12:14:01 2013 -0700
Fix two bugs in slice-min-mbs and slices-max
Slices-max broke slice-max-size when slice-max wasn't used.
Slice-min-mbs broke in rare cases near the end of a threadslice.
x264r2309
git-id : 3b1f1f71459b54b976588b871edc7f459b4d0434
revision : r2309
Author : Jason Garrett-Glaser
Date: Thu Apr 4 18:00:23 2013 -0700
x86: SSSE3 LUT-based faster coeff_level_run
~2x faster coeff_level_run.
Faster CAVLC encoding: {1%,2%,7%} overall with {superfast,medium,slower}.
Uses the same pshufb LUT abuse trick as in the previous ads_mvs patch.
x264r2308
git-id : c05bf544b659510b9008c1037fd8887e8917d30c
revision : r2308
Author : Jason Garrett-Glaser
Date: Mon Mar 25 14:03:37 2013 -0700
x86-64: BMI2 cabac_residual functions
x264r2307
git-id : 437f808579754b5674fb6183331e8ca9bcf53647
revision : r2307
Author : Jason Garrett-Glaser
Date: Wed Mar 20 15:08:35 2013 -0700
x86: SSSE3 ads_mvs
~55% faster ads in benchasm, ~15-30% in real encoding.
~4% faster "placebo" preset overall.
x264r2306
git-id : 2ad961f2d6fc681db6fc87f2c0ca68ff2a00e65e
revision : r2306
Author : Henrik Gramner
Date: Tue Apr 16 23:27:53 2013 +0200
x86: AVX2 pixel_ssd_nv12_core
x264r2305
git-id : 40406b804105964d6b5abea38833d69f6d617815
revision : r2305
Author : Henrik Gramner
Date: Tue Apr 16 23:27:50 2013 +0200
x86: AVX2 high bit-depth pixel_ssd
x264r2304
git-id : c2852be748c66f1ff25f38133d5efbd6059bed6c
revision : r2304
Author : Henrik Gramner
Date: Tue Apr 16 23:27:46 2013 +0200
x86: AVX2 high bit-depth pixel_sad_x3/pixel_sad_x4
Also reduce the number of xmm registers used by sse2/ssse3 pixel_sad_x3.
x264r2303
git-id : fa9dcd02ea386e46314eb0c518b0b5763ef73c80
revision : r2303
Author : Henrik Gramner
Date: Tue Apr 16 23:27:43 2013 +0200
x86: AVX2 high bit-depth vsad
x264r2302
git-id : 567d03619b0af415362454eb20066e0167266a43
revision : r2302
Author : Henrik Gramner
Date: Tue Apr 16 23:27:39 2013 +0200
x86: AVX2 high bit-depth pixel_sad
Also use loops instead of duplicating code; reduces code size by ~10kB with
negligible effect on performance.
x264r2301
git-id : 6cc9f169844cc84a7da8cc4fbf08a3f5dea86c63
revision : r2301
Author : Henrik Gramner
Date: Tue Apr 16 23:27:35 2013 +0200
x86: AVX2 high_bit_depth pixel_avg2, get_ref, mc_copy_w16, mc_luma
Also reduce the number of xmm registers used by mc_copy_* to avoid
saving and restoring xmm6 and xmm7 on 64-bit Windows.
x264r2300
git-id : c3711285a6dd1343197ac3e53bb95acf99c6cb42
revision : r2300
Author : Henrik Gramner
Date: Tue Apr 16 23:27:32 2013 +0200
x86: AVX2 nal_escape
Also rewrite the entire function to be faster and drop the AVX version which is no longer useful.
x264r2299
git-id : 255271fd7999b6b7ff7d65b7b8de1a2dc8919b1a
revision : r2299
Author : Henrik Gramner
Date: Tue Apr 16 23:27:29 2013 +0200
x86: AVX memzero_aligned
x264r2298
git-id : 43632cc8a9115c076204f46e31a5d5c3e58bf934
revision : r2298
Author : Henrik Gramner
Date: Tue Apr 16 23:27:25 2013 +0200
x86: AVX2 predict_16x16_dc
x264r2297
git-id : dcad117131f0e0b5032bf5ca8c27def7fcdce17f
revision : r2297
Author : Henrik Gramner
Date: Tue Apr 16 23:27:22 2013 +0200
x86: AVX2 predict_8x8c_p/predict_8x16c_p
x264r2296
git-id : f5bff68b16e3125dc95705d060c89935a298f0ff
revision : r2296
Author : Henrik Gramner
Date: Tue Apr 16 23:27:18 2013 +0200
x86: AVX2 predict_16x16_p
Also fix the AVX implementation to correctly use the SSSE3 inline asm
instead of SSE2.
x264r2295
git-id : 92eb201b65cb9338500135bda1e2ee4d6861727c
revision : r2295
Author : Henrik Gramner
Date: Tue Apr 16 23:27:14 2013 +0200
x86: AVX high bit-depth predict_16x16_v
Also restructure some code to reduce code size of various functions,
especially in high bit-depth.
x264r2294
git-id : 16f3261076c7159aeea902e68ca064c6d0a2cfd8
revision : r2294
Author : Henrik Gramner
Date: Tue Apr 16 23:27:08 2013 +0200
x86: AVX2 high bit-depth predict_4x4_h
x264r2293
git-id : a38b5fc6ec7348342d8ee4ff21abf3e82c5f7bbf
revision : r2293
Author : Henrik Gramner
Date: Tue Apr 16 23:27:04 2013 +0200
x86: AVX2 high bit-depth predict_16x16_h
x264r2292
git-id : 89f8263b141492a3b45274616fa0327289329c26
revision : r2292
Author : Henrik Gramner
Date: Tue Apr 16 23:27:00 2013 +0200
x86: AVX2 high bit-depth predict_8x8c_h/predict_8x16c_h
x264r2291
git-id : 78b8af872f49aeaa3727ac4e0c8d3b53f0716f51
revision : r2291
Author : Henrik Gramner
Date: Tue Apr 16 23:26:47 2013 +0200
x86util: Support ymm registers in HADD macros
x264r2290
git-id : d07b421cf19fc4d77f0bff9d4d6b11db27d81374
revision : r2290
Author : Jason Garrett-Glaser
Date: Tue Feb 26 16:26:34 2013 -0800
x86: more AVX2 framework, AVX2 functions, plus some existing asm tweaks
AVX2 functions:
mc_chroma
intra_sad_x3_16x16
last64
ads
hpel
dct4
idct4
sub16x16_dct8
quant_4x4x4
quant_4x4
quant_4x4_dc
quant_8x8
SAD_X3/X4
SATD
var
var2
SSD
zigzag interleave
weightp
weightb
intra_sad_8x8_x9
decimate
integral
hadamard_ac
sa8d_satd
sa8d
lowres_init
denoise
x264r2289
git-id : e228f65488b02967bc450bbe3b92ac44eb0088d7
revision : r2289
Author : Loren Merritt
Date: Mon Feb 25 21:16:45 2013 +0000
x86inc: create xm# and ym#, analogous to m#
For when we want to mix simd sizes within one function.
x264r2288
git-id : e916dfb774059bc2b63dfe88e32fa21f51abd2b7
revision : r2288
Author : Jason Garrett-Glaser
Date: Fri Apr 5 16:08:35 2013 -0700
x86inc: fix AVX emulation of cmp(p|s)(s|d)
x264r2287
git-id : 5b01ce105051144c4dd91866e1642cc8d7926c89
revision : r2287
Author : Jason Garrett-Glaser
Date: Tue Feb 5 17:15:00 2013 -0800
x86-64: cabac_block_residual assembly
RDO: ~20% faster than C
Bitstream: ~50% faster than C
1-2% faster overall, highest on preset superfast/fast/medium.
x264r2286
git-id : 3a5f6c0aeacfcb21e7853ab4879f23ec8ae5e042
revision : r2286
Author : Steve Borho
Date: Thu Feb 21 12:48:40 2013 -0600
OpenCL lookahead
OpenCL support is compiled in by default, but must be enabled at runtime by an
--opencl command line flag. Compiling OpenCL support requires perl. To avoid
the perl requirement use: configure --disable-opencl.
When enabled, the lookahead thread is mostly off-loaded to an OpenCL capable GPU
device. Lowres intra cost prediction, lowres motion search (including subpel)
and bidir cost predictions are all done on the GPU. MB-tree and final slice
decisions are still done by the CPU. Presets which do not use a threaded
lookahead will not use OpenCL at all (superfast, ultrafast).
Because of data dependencies, the GPU must use an iterative motion search which
performs more total work than the CPU would do, so this is not work efficient
or power efficient. But if there are GPU cycles to spare, it can often
speed up the encode. Output quality when OpenCL lookahead is enabled is often
very slightly worse than with the CPU lookahead (because of the same data
dependencies).
x264 must compile its OpenCL kernels for your device before running them, and in
order to avoid doing this every run it caches the compiled kernel binary in a
file named x264_lookahead.clbin (--opencl-clbin FNAME to override). The cache
file will be ignored if the device, driver, or OpenCL source are changed.
x264 will use the first GPU device which supports the cl_image
features required by its kernels. Most modern discrete GPUs and all AMD
integrated GPUs will work. Intel integrated GPUs (up to IvyBridge) do not
support those necessary features. Use --opencl-device N to specify a number of
capable GPUs to skip during device detection.
Switchable graphics environments (e.g. AMD Enduro) are currently not supported,
as some have bugs in their OpenCL drivers that cause output to be silently
incorrect.
Developed by MulticoreWare with support from AMD and Telestream.
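The device selection rule in this entry (use the first capable GPU, with `--opencl-device N` skipping N capable GPUs) can be modelled as below. This is a sketch of the described behaviour only, not x264's code: the function `pick_device`, the device names, and the boolean capability flag are hypothetical stand-ins for the real OpenCL queries.

```python
def pick_device(devices, skip=0):
    """Return the first capable device after skipping `skip` capable ones.

    `devices` is a list of (name, supports_required_images) pairs in
    detection order. Returns None when no device qualifies.
    """
    for name, capable in devices:
        if not capable:
            continue  # devices lacking the required features are never counted
        if skip > 0:
            skip -= 1
            continue
        return name
    return None


# Hypothetical detection order: one incapable device, then two capable ones.
devices = [("integrated-a", False), ("discrete-b", True), ("discrete-c", True)]
print(pick_device(devices))
print(pick_device(devices, skip=1))
```

In this model the skip count applies only to capable devices, which matches the wording "a number of capable GPUs to skip".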
x264r2285
git-id : e436158903c6171ce4abe78f03f013fe04f193bd
revision : r2285
Author : Jason Garrett-Glaser
Date: Mon Mar 4 15:19:47 2013 -0800
weightp: improve scale/offset search, chroma
Rescale the scale factor if the offset clips. This makes weightp more effective
in fades to/from white (and any other situation that requires big offsets).
Search more than 1 scale factor and more than 1 offset, depending on --subme.
Try to find the optimal chroma denominator instead of hardcoding it.
Overall improvement: a few percent in fade-heavy clips, such as a sample from
Avatar: TLA.
x264r2284
git-id : 389d06e8f93916b4fe5766ee4503380f2632ef79
revision : r2284
Author : Jason Garrett-Glaser
Date: Tue Feb 19 13:48:44 2013 -0800
Add slices-max feature
The H.264 spec technically has limits on the number of slices per frame. x264
normally ignores this, since most use-cases that require large numbers of
slices prefer it to. However, certain decoders may break with extremely large
numbers of slices, as can occur with some slice-max-size/mbs settings.
When set, x264 will refuse to create any slices beyond the maximum number,
even if slice-max-size/mbs requires otherwise.
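The effect of a slice cap can be illustrated with a toy model. This is an assumption-laden sketch rather than x264's algorithm: it splits a frame only by a per-slice macroblock limit and lets the last permitted slice absorb everything that remains. The function and parameter names are hypothetical.

```python
def split_into_slices(total_mbs, slice_max_mbs, slices_max=0):
    """Return the macroblock count of each slice.

    A new slice starts every `slice_max_mbs` macroblocks, except that once
    the cap `slices_max` is reached (0 means no cap) the last slice takes
    all remaining macroblocks, exceeding `slice_max_mbs` if necessary.
    """
    slices = []
    remaining = total_mbs
    while remaining > 0:
        on_last_allowed = slices_max > 0 and len(slices) == slices_max - 1
        size = remaining if on_last_allowed else min(slice_max_mbs, remaining)
        slices.append(size)
        remaining -= size
    return slices


print(split_into_slices(100, 10))                # no cap
print(split_into_slices(100, 10, slices_max=4))  # capped at four slices
```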
x264r2283
git-id : f546e98eb8f9afd15fb7e8f95ec02fcf65155079
revision : r2283
Author : Jason Garrett-Glaser
Date: Thu Feb 14 17:22:02 2013 -0800
Add slice-min-mbs feature
Works in conjunction with slice-max-mbs and/or slice-max-size to avoid overly
small slices.
Useful with certain decoders that barf on extremely small slices.
If slice-min-mbs would be violated as a result of slice-max-size, x264 will
exceed slice-max-size and print a warning.
x264r2282
git-id : 1db46210d525856a8f9e59944913127287d956c5
revision : r2282
Author : Anton Mitrofanov
Date: Tue Mar 26 18:56:21 2013 +0400
Disable mbtree asm with cpu-independent option
Results vary between versions because of different rounding results.
x264r2281
git-id : fceb3b197f5fcaded3943718c162b662b52b208f
revision : r2281
Author : Anton Mitrofanov
Date: Tue Mar 26 18:30:00 2013 +0400
Show "avs: no" --disable-avs option instead of empty string
x264r2280
git-id : 68ee80a51f6f1de78877a9907e3efcbb1fe13ac6
revision : r2280
Author : Tim Walker
Date: Tue Mar 19 23:42:43 2013 +0100
lavf input: don't use deprecated AVStream fields
Fixes building against newer libavcodecs from the Libav project.
x264r2279
git-id : 5980580d5a4d32eebf32b2f274807dd4aa68836b
revision : r2279
Author : Anton Mitrofanov
Date: Tue Mar 26 19:54:36 2013 +0400
Fix y4m input with C420paldv colorspace
x264r2278
git-id : 580cc69707f6996dad8544d6ef0d5a8bbc1b5864
revision : r2278
Author : Jason Garrett-Glaser
Date: Sat Mar 2 01:22:29 2013 -0800
x86: correctly check stack alignment for Atom hadamard_ac
Regression in r2265 (only affected compilers with broken stack alignment,
like ICL on win32).
x264r2277
git-id : b3c15fcf677a4ceb59c8f4adc39dc93ecd06ff8a
revision : r2277
Author : Loren Merritt
Date: Mon Feb 25 21:23:55 2013 +0000
x86inc: fix some corner cases of SWAP
SWAP with >=3 named (rather than numbered) args
PERMUTE followed by SWAP with 2 named args
used to produce the wrong permutation
x264r2276
git-id : 89aecb440e2939be7fb72d8362eb12504711b94f
revision : r2276
Author : Jason Garrett-Glaser
Date: Wed Feb 27 13:30:22 2013 -0800
Fix array overreads that caused miscompilation in gcc 4.8
x264r2275
git-id : e355b0e12d6cb380c13cdce15b42093eb8eeef44
revision : r2275
Author : Jason Garrett-Glaser
Date: Thu Feb 28 13:32:37 2013 -0800
Fix undefined behavior in x264_ratecontrol_mb
x264r2274
git-id : c832fe995bf3d41cae1d3d22e10cb2288e8a650a
revision : r2274
Author : Stefan Groenroos
Date: Fri Mar 1 22:35:34 2013 +0200
ARM: Fix bug in x264_quant_4x4x4_neon
Regression in r2273.
x264r2273
git-id : b3065e660df391168067f13216d99825260939d4
revision : r2273
Author : Stefan Groenroos
Date: Mon Feb 25 23:43:09 2013 +0200
ARM: update NEON mc_chroma to work with NV12 and re-enable it
Up to 10-15% faster overall.
x264r2272
git-id : e82cf2c8e3bc0d7623f3e8ed9a4684bc3dc40b91
revision : r2272
Author : Jason Garrett-Glaser
Date: Thu Feb 14 15:00:48 2013 -0800
CABAC/CAVLC: use the new bit-iterating macro here too
x264r2271
git-id : 253e2c3f7eab79d74450de4f88a8bf451fd01be4
revision : r2271
Author : Jason Garrett-Glaser
Date: Fri Feb 8 15:34:38 2013 -0800
quant_4x4x4: quant one 8x8 block at a time
This reduces overhead and lets us use less branchy code for zigzag, dequant,
decimate, and so on.
Reorganize and optimize a lot of macroblock_encode using this new function.
~1-2% faster overall.
Includes NEON and x86 versions of the new function.
Using larger merged functions like this will also make wider SIMD, like
AVX2, more effective.
x264r2270
git-id : eaae05ea3f104dc9fa948327e10649ec693adf0e
revision : r2270
Author : Stephen Hutchinson
Date: Tue Feb 12 21:55:43 2013 -0500
Add AvxSynth support to the AviSynth input module.
Uses dlopen to load AvxSynth on Linux and OS X.
Allows the use of --demuxer avs for AvxSynth, though the only source filter it
can currently use is FFMS2.
Add a local copy of avxsynth_c.h and its dependent headers in extras/ so that
users don't need to actually have AvxSynth development headers installed to
enable support for it (mirroring the AviSynth behavior).
Based on a patch by 0x09 (tab@lavabit.com)
x264r2269
git-id : b2c70f6548a68b874006a176d48cd0ca4e03859a
revision : r2269
Author : Jason Garrett-Glaser
Date: Fri Feb 8 00:13:15 2013 -0800
Eliminate some branchiness in ME/analysis
Faster, fewer branch mispredictions.
x264r2268
git-id : 9d600d64194e0b2a77a8d9aa3f05b141cf473af0
revision : r2268
Author : Jason Garrett-Glaser
Date: Wed Feb 6 16:55:39 2013 -0800
Fix some store forwarding stalls
There are quite a few others, but fixing most of them doesn't help or there's no
easy way to avoid them.
x264r2267
git-id : 9fe40b1e0db6cd93652e3a45dbbd8f24dbe0b70e
revision : r2267
Author : Jason Garrett-Glaser
Date: Tue Feb 5 01:23:23 2013 -0800
x86: faster AVX satd/sa8d/sa8d_satd/hadamard_ac
Use Conroe-style movddup in AVX transforms; both Sandy Bridge and Bulldozer
do movddup in the load unit, so it's totally free this way.
On Sandy Bridge:
~6% faster sa8d_satd
~5% faster hadamard_ac
~9% faster 32-bit satd
~2% faster sa8d
x264r2266
git-id : 4f24bb34453fdedefd161063e20516d148b80f8b
revision : r2266
Author : Jason Garrett-Glaser
Date: Sat Feb 2 12:37:08 2013 -0800
x86: detect Bobcat, improve Atom optimizations, reorganize flags
The Bobcat has a 64-bit SIMD unit reminiscent of the Athlon 64; detect this
and apply the appropriate flags.
It also has an extremely slow palignr instruction; create a flag for this to
avoid massive penalties on palignr-heavy functions.
Improve Atom function selection and document exactly what the SLOW_ATOM flag
covers.
Add Atom-optimized SATD/SA8D/hadamard_ac functions: simply combine the ssse3
optimizations with the sse2 algorithm to avoid pmaddubsw, which is slow on
Atom along with other SIMD multiplies.
Drop TBM detection; it'll probably never be useful for x264.
Invert FastShuffle to SlowShuffle; it only ever applied to one CPU (Conroe).
Detect CMOV, to fail more gracefully when run on a chip with MMX2 but no CMOV.
x264r2265
git-id : d556d5540ab90b2c89a5ba0cd6ce393f87c19faf
revision : r2265
Author : Oskar Arvidsson
Date: Sat Jan 19 01:47:09 2013 +0100
x86: combined SA8D/SATD dsp function
Speedup is most apparent for 8-bit (~30%), but gives some improvements
for 10-bit too (~12%).
64-bit only for now.
x264r2264
git-id : 5c2ca5dee339a215cb331c426d40fa548675f088
revision : r2264
Author : Oskar Arvidsson
Date: Tue Jan 29 23:44:32 2013 +0100
x86: port SSE2+ SATD functions to high bit depth
Makes SATD 20-50% faster across all partition sizes but 4x4.
x264r2263
git-id : b09bc0cc936751f6ad1f20f5e11f523f6051ebc3
revision : r2263
Author : Oskar Arvidsson
Date: Wed Feb 6 02:07:53 2013 +0100
x86: faster high bit depth ssd
About 15% faster on average.
x264r2262
git-id : 91049858f8a051e87efcbe97285657fa3ef9a639
revision : r2262
Author : Jason Garrett-Glaser
Date: Fri Jan 18 22:55:46 2013 -0800
x86: optimize and clean up predictor checking
Branchlessly handle elimination of candidates in MMX roundclip asm.
Add a new asm function, similar to roundclip, except without the round part.
Optimize and organize the C code, and make both subme>=3 and subme<3 consistent.
Add lots of explanatory comments and try to make things a little more understandable.
~5-10% faster with subme>=3, ~15-20% faster with subme<3.
x264r2261
git-id : a216e5c92a1543e5d748928f7531cfd771739cbf
revision : r2261
Author : Jason Garrett-Glaser
Date: Tue Jan 22 12:31:55 2013 -0800
Fix two bugs in predictor checking
pmv wasn't checked properly in some cases, as well as zero vector.
Output-changing portion of the following patch.
x264r2260
git-id : 4d220bc18cb177b6812c381e7fb808f9ae3189e1
revision : r2260
Author : Jason Garrett-Glaser
Date: Thu Jan 10 13:15:52 2013 -0800
Improve lookahead-threads auto selection
Smarter decision to improve fast-first-pass performance in 2-pass encodes.
Dramatically improves CPU utilization on multi-core systems.
Tested on a quad-core Ivy Bridge (12 threads, 1080p):
Fast first pass:
veryfast: ~7% faster
faster: ~11% faster
fast/medium: ~15% faster
slow/slower: ~42% faster
veryslow: ~55% faster
CRF/1-pass:
veryfast: ~9% faster
(all others remained the same)
x264r2259
git-id : c63a518d43bb3822342513eb4af109551e86fbd2
revision : r2259
Author : Henrik Gramner
Date: Sun Jan 27 23:01:59 2013 +0100
x86: Use SSE instead of SSE2 for copying data
Reduces code size because movaps/movups is one byte shorter than movdqa/movdqu.
Also merge MMX and SSE versions of memcpy_aligned into a single macro.
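The one-byte difference comes from the mandatory prefix that the SSE2 integer moves carry. The table below lists the standard non-VEX encodings of the register-to-register forms as a quick check. The choice of operands (xmm0, xmm1) is arbitrary.

```python
# Register-to-register forms, xmm0 <- xmm1 (ModRM byte 0xC1).
ENCODINGS = {
    "movaps xmm0, xmm1": bytes([0x0F, 0x28, 0xC1]),
    "movups xmm0, xmm1": bytes([0x0F, 0x10, 0xC1]),
    "movdqa xmm0, xmm1": bytes([0x66, 0x0F, 0x6F, 0xC1]),  # 0x66 prefix
    "movdqu xmm0, xmm1": bytes([0xF3, 0x0F, 0x6F, 0xC1]),  # 0xF3 prefix
}

for text, code in ENCODINGS.items():
    print(f"{text}: {code.hex(' ')} ({len(code)} bytes)")
```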
x264r2258
git-id : 0ce5b431b94f3934a7229ab264c12f1106e4330d
revision : r2258
Author : Henrik Gramner
Date: Sun Jan 13 18:27:08 2013 +0100
64-bit cabac optimizations
~4% faster PIC
WIN64:
~3% faster and 16 byte shorter cabac_encode_bypass
~8% faster cabac_encode_terminal
Benchmarked on Ivy Bridge
UNIX64:
One instruction less in cabac_encode_bypass
x264r2257
git-id : 51a5976144d80d9dc178fcaba2da5224809ee6ba
revision : r2257
Author : Mike Gorchak
Date: Sat Feb 2 23:35:00 2013 -0800
configure: add QNX support
x264r2256
git-id : 486ff39f398401d126fbf0379287b1a7ca7fae6e
revision : r2256
Author : Henrik Gramner
Date: Sun Jan 20 19:35:06 2013 +0100
Windows: Enable DEP and ASLR
x264r2255
git-id : 8da42b78154304ef194747a375a7e1ff3021d0a9
revision : r2255
Author : Henrik Gramner
Date: Thu Jan 17 19:17:24 2013 +0100
x86inc: Set ELF hidden visibility for global constants
x264r2254
git-id : 989019209b2ccc828480f0e1f506747703134db3
revision : r2254
Author : Diego Biurrun
Date: Thu Jan 17 11:18:31 2013 +0100
x86inc: Add cvisible macro for C functions with public prefix
This allows defining externally visible library symbols.
Signed-off-by: Diego Biurrun
x264r2253
git-id : 91b0f0e6415b9cc56b625eb77dd5b471a59d3230
revision : r2253
Author : Diego Biurrun
Date: Thu Jan 17 11:30:37 2013 -0800
x86inc: rename program_name to private_prefix
Synced from libav.
The new name is more descriptive and will allow defining a separate public
prefix for externally visible library symbols.
x264r2252
git-id : a4e77598d2e1e55483bf0918f6ec2fda51ee9507
revision : r2252
Author : Jason Garrett-Glaser
Date: Mon Jan 14 05:35:30 2013 -0800
x264.h: improve x264_encoder_reconfig documentation
x264r2251
git-id : ce9efeafaad38bc6795d4469c952af2d5bb75a84
revision : r2251
Author : Henrik Gramner
Date: Sat Feb 16 19:36:50 2013 +0100
Cosmetics: stricter definition of parameterless functions
x264r2250
git-id : e403db4f9079811f5a1f9a1339e7c85b41800ca7
revision : r2250
Author : Neil
Date: Mon Jan 28 10:47:38 2013 +0800
Update "Install and compile x264" in doc/regression_test.txt
x264r2249
git-id : c13fbaf279d41e6bb8db09e95aec1b638ff026e8
revision : r2249
Author : Anton Mitrofanov
Date: Thu Jan 24 12:11:26 2013 +0400
Fix possible non-determinism with mbtree + open-gop + sync-lookahead
Code assumed keyframe analysis would only pull one frame off the list; this
isn't true with open-gop.
x264r2248
git-id : 736d69b5875587b61c03aa45438e19ddba1f7035
revision : r2248
Author : Anton Mitrofanov
Date: Mon Feb 25 19:28:19 2013 +0400
x86: don't use the red zone on win64
x264r2247
git-id : 637005ebef9f36b816a9777183660ea17f5b249d
revision : r2247
Author : Jason Garrett-Glaser
Date: Sun Feb 10 16:12:34 2013 -0800
x86-64: fix trellis asm with interlacing
Regression in r2145.
Assembly assumed array was [2][64] when it was actually [2][63].
Tiny (~0.1%) compression improvement.
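A mismatch between an assumed [2][64] layout and an actual [2][63] layout shifts every access to the second row. The sketch below shows the effect with flat element indices. It ignores element size and does not model the actual trellis tables, whose names are not given in this entry.

```python
def flat_index(row, col, cols):
    """Element offset of [row][col] in a row-major array with `cols` columns."""
    return row * cols + col


ACTUAL_COLS = 63   # the real array was [2][63]
ASSUMED_COLS = 64  # the assembly assumed [2][64]

for row in (0, 1):
    actual = flat_index(row, 0, ACTUAL_COLS)
    assumed = flat_index(row, 0, ASSUMED_COLS)
    print(f"row {row}: actual offset {actual}, assumed offset {assumed}")
```

Row 0 lines up under both layouts, so only lookups in the second row land one element too far.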
x264r2246
git-id : ba5ce76f7506b5f3d083a9eda8c4705e192f15ff
revision : r2246
Author : Ronald S. Bultje
Date: Wed Jan 30 09:48:14 2013 -0800
x86-32: use simple nop codes for <= sse
The "CentaurHauls family 6 model 9 stepping 8" family of CPUs (flags:
fpu vme de pse tsc msr cx8 sep mtrr pge mov pat mmx fxsr sse up rng
rng_en ace ace_en) SIGILLs on long nop codes.
x264r2245
git-id : bc13772e21d0e61dea6ba81d0d387604b5b360df
revision : r2245
Author : Loren Merritt
Date: Tue Jan 8 21:30:57 2013 +0000
Bump dates to 2013
x264r2244
git-id : 3508f4a1446c408dcc0febe1a349ad303ae6628c
revision : r2244
Author : Henrik Gramner
Date: Mon Dec 17 21:54:00 2012 +0100
x86inc: Drop tzcnt workaround
It is no longer needed now that we've bumped the version requirement of yasm to 1.2.0.
x264r2243
git-id : b924133cabd125286488e16cfa71488ad4105d63
revision : r2243
Author : Jason Garrett-Glaser
Date: Mon Nov 12 10:28:53 2012 -0800
AVX2/FMA3 version of mbtree_propagate
First AVX2 function for testing.
Bump yasm version to 1.2.0 for AVX2 support.
x264r2242
git-id : d967c09cd93a230e03ec1e0f0f696975d15a01c0
revision : r2242
Author : Henrik Gramner
Date: Tue Dec 11 16:05:34 2012 +0100
x86inc: Use VEX-encoded instructions in AVX functions
Automatically use VEX-encoding in AVX/AVX2/XOP/FMA3/FMA4 functions for all instructions that exist in a VEX-encoded version.
This change makes it easier to extend existing code to use AVX2.
Also add support for AVX emulation of a few instructions that were missing before.
x264r2241
git-id : f6c628650558803ed65cb15c1853113cc589ae4a
revision : r2241
Author : Loren Merritt
Date: Sun Dec 2 15:56:30 2012 +0000
x86inc: activate REP_RET automatically
Now RET checks whether it immediately follows a branch, so the programmer doesn't have to keep track of that condition.
REP_RET is still needed manually when it's a branch target, but that's much rarer.
The implementation involves lots of spurious labels, but that's ok because we strip them.
x264r2240
git-id : 755fece365c14135c2621585e761f5dfeedefc74
revision : r2240
Author : Ronald S. Bultje
Date: Thu Dec 6 15:40:13 2012 -0800
x86inc: support stack mem allocation and re-alignment in PROLOGUE
Use this in 8-bit loopfilter functions so they can be used if
there is no aligned stack (e.g. x86-32 MSVC or ICC 10.x).
x264r2239
git-id : c69e2d02c4a2ee171b6b8ca0a2e1032213e561bc
revision : r2239
Author : Henrik Gramner
Date: Mon Dec 17 22:15:02 2012 +0100
Update config.guess and config.sub
x264r2238
git-id : 9c4ba4bde8965571159eae2d79f85cabbb47416c
revision : r2238
Author : Anton Mitrofanov
Date: Tue Jan 8 13:29:49 2013 -0800
Fix crash if the first frame is forced to a non-keyframe
This is obviously bad user input, but x264 shouldn't crash if it happens.
x264r2237
git-id : 593e8cc0b374aa7b20d3d961c57feb9bab508979
revision : r2237
Author : Bernhard Rosenkränzer
Date: Sun Dec 30 12:18:00 2012 -0800
Fix build on ARM with binutils >= 2.23.51.0.6
GAS doesn't seem to like spaces in vld1 anymore, so remove those.
x264r2236
git-id : 55b5162d7ad9a70e2b6ae5ba3f743a35c2135aaf
revision : r2236
Author : Anton Mitrofanov
Date: Fri Nov 23 18:26:53 2012 +0400
Fix pthread_join emulation on win32 and BeOS
Doesn't actually affect x264, but it's more correct.
x264r2235
git-id : 0059dcf938451134d8f9c8f1ad522a2c6071e7cd
revision : r2235
Author : Jason Garrett-Glaser
Date: Tue Nov 27 07:50:51 2012 -0800
Fix typo in r2222
Slightly wrong numbers in level table.
x264r2234
git-id : 28ee1f47ed4366351477065a0f794f05402e69a7
revision : r2234
Author : Sergio Basto
Date: Thu Nov 22 18:02:50 2012 -0800
configure: fix gpac detection with -Wp,-D_FORTIFY_SOURCE=2
x264r2233
git-id : 67a69c06d7bd7907b5d1e058a26284c06baa93d1
revision : r2233
Author : Sean McGovern
Date: Thu Nov 22 18:01:16 2012 -0800
Solaris: use sysconf to get processor count
Solaris responds correctly to the same value as Cygwin, so let's use that.
x264r2232
git-id : 6e68ab73908f339cdd91c40943fef46fd1f832fa
revision : r2232
Author : Anton Khirnov
Date: Tue Nov 13 21:01:24 2012 +0100
lavf input: allocate AVFrame correctly
Allocate AVFrames correctly with avcodec_alloc_frame().
This caused crashes with newer libavcodecs that try to free frame extradata.
x264r2231
git-id : a632fe1a57baccdf1bcb340197fe48281cd3117f
revision : r2231
Author : Anton Mitrofanov
Date: Sun Nov 11 03:44:02 2012 +0400
Fix crash when using libx264.dll compiled with ICL for X86_64
x264r2230
git-id : 1cffe9f406cc54f4759fc9eeb85598fb8cae66c7
revision : r2230
Author : Anton Mitrofanov
Date: Fri Nov 9 02:31:10 2012 +0400
Fix possible issues with out-of-spec QP values
Fixes a possible regression in r2228.
x264r2229
git-id : 349b9bdefae84b006c4bdb7e07290b88a18bbbb2
revision : r2229
Author : Jason Garrett-Glaser
Date: Wed Sep 26 13:49:02 2012 -0700
Attempt to optimize PPS pic_init_qp in 2-pass mode
Small compression improvement; up to ~0.5% in extreme cases.
Helps more with small slice sizes (tiny resolutions or slice-max-size).
Note that this changes the 2-pass stats file format.
x264r2228
git-id : 8437d0db5de43cf9cd11e02444c80984935e25dc
revision : r2228
Author : Jason Garrett-Glaser
Date: Wed Sep 26 13:05:00 2012 -0700
Improve slice header QP selection
Use the first macroblock of each slice instead of the last of the previous.
Lets us pick a reasonable initial QP for the first slice too.
Slightly improved compression.
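The two selection policies can be compared on a toy frame. This is a sketch of the idea as worded in this entry, not x264's code: the function name, the sample QPs, and the use of `None` for "no previous macroblock" under the old policy are assumptions.

```python
def header_qps(mb_qps, slice_starts, use_first_mb):
    """Pick one header QP per slice.

    mb_qps: QP of every macroblock in coding order.
    slice_starts: index of the first macroblock of each slice.
    use_first_mb: True for the new policy, False for the old one.
    """
    qps = []
    for start in slice_starts:
        if use_first_mb:
            qps.append(mb_qps[start])
        elif start == 0:
            qps.append(None)  # placeholder: no previous macroblock exists
        else:
            qps.append(mb_qps[start - 1])
    return qps


# Hypothetical frame: three slices of two macroblocks each.
mb_qps = [26, 26, 30, 30, 22, 22]
slice_starts = [0, 2, 4]
print(header_qps(mb_qps, slice_starts, use_first_mb=False))
print(header_qps(mb_qps, slice_starts, use_first_mb=True))
```

With the new policy the header QP equals the QP of the slice's first macroblock, so that macroblock needs no QP delta in this model.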
x264r2227
git-id : d2d8364ff48f789ef92135d24c6f185c4eccbeba
revision : r2227
Author : Jason Garrett-Glaser
Date: Thu Oct 11 13:27:48 2012 -0700
Update level dpb size calculation to match newer H.264 spec
Doesn't actually change encoding behavior, but makes it more correct.
Warning messages should now be accurate at higher bit depths and non-4:2:0.
Technically, since it redefines x264_level_t, this is an API version increment.
x264r2226
git-id : 28ddb0dd533154b58f9147932fb1dec4c74127c8
revision : r2226
Author : Jan Ekström
Date: Sun Oct 7 21:12:05 2012 +0300
Add support for the ffmpeg/vapoursynth high bit depth y4m extensions
x264r2225
git-id : 64cbe75cf3cae2bbc8fb34bcda5a9742d22f83f2
revision : r2225
Author : Diego Biurrun
Date: Tue Nov 6 14:48:56 2012 +0100
x86inc: Rename 3dnow2 to 3dnowext
The name "3dnowext" is more common than "3dnow2". Doesn't affect x264.
x264r2224
git-id : f418867a8b76f31acf3a965eed34c5587294e948
revision : r2224
Author : Diego Biurrun
Date: Wed Oct 31 12:23:54 2012 -0700
x86inc: only define program_name if the macro is unset.
This allows overriding the value from outside the file.
This can be useful if x86inc.asm is used outside of x264.
x264r2223
git-id : f6a8615ab0c922ac2cb5c82c9824f6f4742b1725
revision : r2223
Author : David Wolstencroft
Date: Mon Oct 29 09:07:39 2012 -0700
Disable ARM NEON MRC CPU test for Apple devices
The Apple A6 CPU doesn't support performance counters, so this test caused a crash.
x264r2222
git-id : 6889f2cee49314aa380d4803991d645659efc01f
revision : r2222
Author : Jason Garrett-Glaser
Date: Tue Nov 6 12:03:20 2012 -0800
Fix crash with no-scenecut + mbtree
x264r2221
git-id : 4dbfcd462ccdf065654d17c47e1d05d53f213bf1
revision : r2221
Author : Anton Mitrofanov
Date: Fri Oct 12 23:43:40 2012 +0400
Fix reconfiguring to crf=0
Lossless mode can't currently be enabled mid-stream.
x264r2220
git-id : 2ec8c64580efc10bbfc343d4bec2cf6bbb7d68c7
revision : r2220
Author : Derek Buitenhuis
Date: Mon Sep 17 11:09:20 2012 -0700
Fix ALIGNED_ARRAY_EMU macros on ICL
ICL's preprocessor doesn't handle it correctly.
This fix is similar to libav's fix in 0db2d9.
x264r2219
git-id : 2f154ac1652000afe16140cb12c35d777f0c60c8
revision : r2219
Author : Jason Martens
Date: Thu Sep 13 11:20:40 2012 -0700
Fix use of deprecated av_close_input_file call
x264r2218
git-id : 9fc00654018ff9f8a13dbe66785e31568a0c3229
revision : r2218
Author : Brad Smith
Date: Wed Sep 26 14:13:27 2012 -0700
Fix pkg-config for dynamic vs static linking
x264r2217
git-id : b22f22fdb1f6d61ccc7b0c867b530322ea681133
revision : r2217
Author : Brad Smith
Date: Mon Sep 10 17:52:04 2012 -0700
Set libm in the configure script if the OS has libm
Prerequisite for another configure patch after this.
Idea copied from libpthread.
x264r2216
git-id : 198a7ea13ccb727d4ea24b29f5da9b0292387309
revision : r2216
Author : Jason Garrett-Glaser
Date: Thu Aug 16 13:40:32 2012 -0700
Enhance mb_info: add mb_info_update
This feature lets the callee know which decoded macroblocks have changed.
x264r2215
git-id : f6e9002dd03329b69ea56391b3f4197efca7a690
revision : r2215
Author : Jason Garrett-Glaser
Date: Thu Aug 16 13:01:17 2012 -0700
Fix mb_info_free with sliced threads
x264 would free mb_info before it was completely done using it.
x264r2214
git-id : de725e98eb87198542aae5b8c5ebab4f6c06446e
revision : r2214
Author : Jason Garrett-Glaser
Date: Tue Aug 7 12:43:26 2012 -0700
Enhance nalu_process
Add the input frame opaque pointer to the arguments.
This makes it easier to use with multiple simultaneous x264 encodes.
x264r2213
git-id : 174cfac6344a9fad1577cd1f449b7d0e625d6e28
revision : r2213
Author : Jason Garrett-Glaser
Date: Mon Aug 6 14:55:35 2012 -0700
Improve mb_info constant mb optimization
Allow fast skipping even if the pskip MV isn't zero.
x264r2212
git-id : f57e7070d949b02e1a548382a549c34cf491e05e
revision : r2212
Author : Jason Garrett-Glaser
Date: Mon Jul 30 12:58:34 2012 -0700
Export the average effective CRF of each frame
Useful to judge the resulting quality of a frame when VBV is enabled.
x264r2211
git-id : d7fd6cc060b6ae3f3bcb9e09fc8bf532a8ed3a82
revision : r2211
Author : Brad Smith
Date: Mon Aug 20 23:58:19 2012 -0700
Remove special-casing for OpenBSD pthread handling
Previously it was policy to use -pthread, but OpenBSD now recommends -lpthread.
It has been libpthread anyway, and policy has changed to stop using -pthread.
x264r2210
git-id : 8f7644865010385efcb4cb5bd239b28edb4b49e2
revision : r2210
Author : Ronald S. Bultje
Date: Thu Jul 26 18:01:49 2012 -0700
x86inc: automatically insert vzeroupper for YMM functions
Backported from libav.
x264r2209
git-id : 68dfb7b352c4d273e44668c1f6e4a9a283a37e84
revision : r2209
Author : Kieran Kunhya
Date: Tue Jul 24 08:47:45 2012 -0700
Free user supplied data when deleting a frame
This eliminates a memory leak when calling x264_encoder_close.
x264r2208
git-id : d9d2288cabcfd1a90f29f2f11c8cce5450a08ffa
revision : r2208
Author : Jason Garrett-Glaser
Date: Wed Jul 18 08:33:41 2012 -0700
Revert r2204
People don't seem to like this so I'm just going to get rid of it.
x264r2207
git-id : 5f615f7f93d830e55e6fe4f04d214b93d8cb4b53
revision : r2207
Author : Jason Garrett-Glaser
Date: Tue Jul 10 14:10:44 2012 -0700
Faster predictor checking with subme<3
Fix a typo that made an early-skip less effective.
Avoid a relatively unpredictable branch.
Slightly changed output due to the typo-fix.
~50 cycles faster on Core i7.
x264r2206
git-id : 5af86bedd71c89fc48b50bbb7e8a8bec3d360d3a
revision : r2206
Author : Jason Garrett-Glaser
Date: Mon Jun 25 18:01:29 2012 -0700
Try 8x8 transform analysis even when sub8x8 partitions are present
Turn off the sub8x8 partitions, try it, and turn them back on if it didn't help.
Small compression improvement with p4x4 on (~0.1-0.5%).
Also update related comments.
x264r2205
git-id : 913485d26b19dddb6340f7115843d63cde8bb836
revision : r2205
Author : Jason Garrett-Glaser
Date: Fri Jun 8 18:19:59 2012 -0700
Support changing resolutions between passes with macroblock-tree
Implement a basic separable bilinear filter to rescale the quantizer offsets.
Structure inspired by swscale, but floating-point instead of fixed-point.
Not as optimized as it could be, but it's quite fast already.
Example compression penalties on a 720p video game recording:
First pass with 720p and second as 480p: ~-1.5% (vs. same res)
First pass with 480p and second as 720p: ~-3% (vs. same res)
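The separable bilinear rescale described in r2205 can be sketched roughly as follows: a horizontal linear-interpolation pass followed by a vertical one, in floating point. This is not x264's code. The function names, the pixel-centre mapping, and the edge clamping are all assumptions made for illustration.

```python
def _linear_taps(src_len, dst_len):
    """For each output index, return (left index, right index, right weight)."""
    taps = []
    for i in range(dst_len):
        # Map the centre of output sample i into source coordinates.
        pos = (i + 0.5) * src_len / dst_len - 0.5
        # Clamp so that samples near the border reuse the edge value.
        pos = min(max(pos, 0.0), src_len - 1.0)
        left = int(pos)
        right = min(left + 1, src_len - 1)
        taps.append((left, right, pos - left))
    return taps

def rescale_offsets(grid, dst_w, dst_h):
    """Rescale a 2D grid of floats (list of rows) to dst_w x dst_h."""
    src_h, src_w = len(grid), len(grid[0])
    # Horizontal pass: resample every row to the target width.
    h_taps = _linear_taps(src_w, dst_w)
    tmp = [[row[l] * (1.0 - w) + row[r] * w for l, r, w in h_taps]
           for row in grid]
    # Vertical pass: blend pairs of resampled rows to reach the target height.
    v_taps = _linear_taps(src_h, dst_h)
    return [[tmp[t][x] * (1.0 - w) + tmp[b][x] * w for x in range(dst_w)]
            for t, b, w in v_taps]
```

Because the filter is separable, each pass only needs a one-dimensional table of taps, which keeps the cost linear in the number of output samples.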
x264r2204
git-id : 8b535d9006d87e32c4ff939691b920da823ae85a
revision : r2204
Author : Alexander Prikhodko
Date: Tue Jun 12 20:21:35 2012 +0300
Print elapsed time in encoding progress indicator
x264r2203
git-id : 11e32c534a213168d8f466fb64bee75e1534d7af
revision : r2203
Author : Anton Mitrofanov
Date: Sat Jun 2 21:27:50 2012 +0400
Cap ratecontrol predictor parameters
Limits VBV mispredictions after long periods of relatively constant video.
x264r2202
git-id : e21e9c972ed830ac7ad264912b41543adf7e720f
revision : r2202
Author : Loren Merritt
Date: Tue Jul 3 14:38:04 2012 -0700
x86inc: import patches from libav
Allow manual invocation of WIN64_SPILL_XMM even under INIT_MMX
SSE version of mova is movaps rather than movdqa.
YMM version of movnta.
Add mp size for named arguments.
Fix DEFINE_ARGS when used outside of a cglobal.
Define a few more cpuflags.
3-argument wrappers for a few more instructions.
x264r2201
git-id : 37be55213a39db40cf159ada319bd482a1b00680
revision : r2201
Author : Anton Mitrofanov
Date: Fri Jun 22 22:02:24 2012 +0400
Fix crash with --fps 0
Fix some integer overflows and check input parameters better.
Also fix incorrect type specifiers for demuxer info printing.
x264r2200
git-id : 999b753ff0f4dc872077f4fa90d465e948cbe656
revision : r2200
Author : Jason Garrett-Glaser
Date: Tue May 8 15:42:56 2012 -0700
Threaded lookahead
Split each lookahead frame analysis call into multiple threads. Has a small
impact on quality, but does not seem to be consistently any worse.
This helps alleviate bottlenecks with many cores and frame threads. In many
cases, this massively increases performance on many-core systems. For example,
over 100% faster 1080p encoding with --preset veryfast on a 12-core i7 system.
Realtime 1080p30 at --preset slow should now be feasible on real systems.
For sliced-threads, this patch should be faster regardless of settings (~10%).
By default, lookahead threads are 1/6 of regular threads. This isn't exacting,
but it seems to work well for all presets on real systems. With sliced-threads,
it's the same as the number of encoding threads.
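The default described in r2200 can be written out as a small sketch. The function name, the integer floor, and the minimum of one thread are assumptions; the changelog only states the 1/6 ratio and the sliced-threads case.

```python
def default_lookahead_threads(threads, sliced_threads=False):
    """Pick a lookahead thread count from the number of encoding threads."""
    if sliced_threads:
        # With sliced threads, lookahead uses as many threads as encoding.
        return threads
    # Otherwise roughly one sixth of the encoding threads (assumed: at least 1).
    return max(1, threads // 6)
```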
x264r2199
git-id : ecfbf9d8025e39783bc4262dc1972ca742d8a993
revision : r2199
Author : Anton Mitrofanov
Date: Fri May 4 17:18:12 2012 +0400
Add support for RGB formats in bit-depth conversion filter
x264r2198
git-id : 1c97f3570fba02f768fbf649b9f7d48beb720048
revision : r2198
Author : Anton Mitrofanov
Date: Sat May 12 13:57:49 2012 +0400
Fix some bugs in mb_info code
x264r2197
git-id : 69a0443e7d8ab032a7f3c3468a42177d5e64daa2
revision : r2197
Author : Jason Garrett-Glaser
Date: Thu Mar 29 14:14:07 2012 -0700
Add mb_info API for signalling constant macroblocks
Some use-cases of x264 involve encoding video with large constant areas of the frame.
Sometimes, the caller knows which areas these are, and can tell x264.
This API lets the caller do this and adds internal tracking of modifications to macroblocks to avoid problems.
This is really only suitable without B-frames.
An example use-case would be using x264 for VNC.
x264r2196
git-id : df6252cfed7c23fbe883456f4e0607a7f8e91ad8
revision : r2196
Author : Henrik Gramner
Date: Sat Apr 7 00:40:09 2012 +0200
Faster chroma weight cost calculation
New assembly function with SSE2, SSSE3 and XOP implementations for calculating absolute sum of differences.
x264r2195
git-id : cce88ebc9e517b0fa8735b81ac30b4e6a79c8154
revision : r2195
Author : Lucien
Date: Sat Mar 31 13:42:49 2012 +0100
Add Level 5.2 support
x264r2194
git-id : ee30c84e38b30896ffa6ddc417f3b4c281a86d1a
revision : r2194
Author : Henrik Gramner
Date: Thu Apr 12 19:14:43 2012 +0200
Eradicate all mention of Extended Profile
x264 never supported it and never will because nobody uses it.
x264r2193
git-id : 8ca49cc5c40813d8b98544989eb684e167b06aa0
revision : r2193
Author : Anton Mitrofanov
Date: Tue Apr 3 21:46:52 2012 +0400
Fix disabling of mbtree when using 2pass encoding and zones
x264r2192
git-id : 3691332c0b33a68f9d6f519edaa2b848ed34a38c
revision : r2192
Author : Alexander Prikhodko
Date: Sat Mar 31 12:06:21 2012 +0300
configure: force select -mXX gcc option for i386/x86-64
Makes multilib compilation more convenient.
x264r2191
git-id : e1ccbf9bb3abdd25d3f0c76682926ec49f3f8001
revision : r2191
Author : Rafaël Carré
Date: Sun Apr 15 21:20:14 2012 -0400
Update config.guess and config.sub
Adds support for a bunch of targets, including:
aarch64 (armv8)
arm-linux-androideabi
x264r2190
git-id : f87619768dba73c1effbcfb08875d096575e079e
revision : r2190
Author : Alexander Prikhodko
Date: Sat Mar 31 11:33:41 2012 +0300
configure: correct use of RC variable and add --extra-rcflags
x264r2189
git-id : 35cf912671fddcb3e701bf667a75f77dd8b28264
revision : r2189
Author : Steven Walters
Date: Wed Mar 28 21:15:04 2012 -0400
ICL/MSVS: Fix shared library generation and usage
MSVS requires exported variables to be declared with the DATA keyword, and requires that imported variables be declared with dllimport.
This does not fix x264 cli being unable to use a shared library built by ICL however.
x264r2188
git-id : 259a6e57ae25c71acc1669e0aefde7ffe7e235ec
revision : r2188
Author : Kieran Kunhya
Date: Tue Mar 27 17:38:56 2012 +0100
Fix intra-refresh + hrd
x264r2187
git-id : e0351cdfeb45bf7f891eeb1dc475292154bb9d82
revision : r2187
Author : Anton Mitrofanov
Date: Sun Mar 25 17:34:24 2012 +0400
Fix frame input colorspace check
x264r2186
git-id : 7392c8c31f791e9b4c10e4959f8715c8a8233d25
revision : r2186
Author : Jason Garrett-Glaser
Date: Thu Mar 22 13:56:50 2012 -0700
Fix comment in deblock.c
The code does, in fact, handle CAVLC+8x8dct correctly already.
x264r2185
git-id : 6979713216d792e44e3cbaeeba74b455e0a07c62
revision : r2185
Author : Jason Garrett-Glaser
Date: Tue Mar 13 14:37:26 2012 -0700
Fix sliced-threads ratecontrol bug
Was using qp instead of qscale; could cause NANs (not to mention less accurate results).
x264r2184
git-id : 5c85e0a2b7992fcaab09418e3fcefc613cffc743
revision : r2184
Author : Anton Mitrofanov
Date: Sun Mar 11 23:08:18 2012 -0700
Fix clobbering of mutex/cvs
Regression in r2183.
Bizarrely seemed to work on many platforms, but crashed on win64 and may have been slower.
Only affected sliced threads during encoding, but could cause crashes on x264 encoder close even without sliced threads.
x264r2183
git-id : c522ad1fed167d0e985e4f9dcdee042473cf74db
revision : r2183
Author : Jason Garrett-Glaser
Date: Fri Feb 24 13:34:39 2012 -0800
Sliced-threads: do hpel and deblock after returning
Lowers encoding latency around 14% in sliced threads mode with preset superfast.
Additionally, even if there is no waiting time between frames, this improves parallelism, because hpel+deblock are done during the (singlethreaded) lookahead.
For ease of debugging, dump-yuv forces all of the threads to wait and finish instead of setting b_full_recon.
x264r2182
git-id : 6a27a481d4c3508ce778a61a139a4734bb8126f7
revision : r2182
Author : Jason Garrett-Glaser
Date: Fri Feb 24 13:16:52 2012 -0800
Add full-recon API option
Fully reconstruct frames even without dump-yuv.
x264r2181
git-id : e856755d2a67f45249c24cb51aa38fc4fa192321
revision : r2181
Author : Jason Garrett-Glaser
Date: Wed Feb 22 13:33:36 2012 -0800
x86inc: switch to amdnops
Recent AMD CPUs' instruction decoders choke horribly on extremely long nops (i.e. with 4 prefixes).
Won't affect much, since we don't use ALIGN much.
x264r2180
git-id : 5a242c5862baaa4bd5829bd1b43dc11cf5c86344
revision : r2180
Author : Jason Garrett-Glaser
Date: Tue Feb 14 16:54:03 2012 -0800
BMI1 decimate functions
Intel was nice enough to make tzcnt equal to "rep bsf", which is backwards-compatible.
This means we don't actually have to add new functions to make it work.
x264r2179
git-id : ac31c59a98c6c690894670b9c9af2612f799d85b
revision : r2179
Author : Jason Garrett-Glaser
Date: Tue Feb 14 15:07:10 2012 -0800
Minor asm changes
x264r2178
git-id : 83561e55dde06f3247aa9b99fa62ead38d7a406e
revision : r2178
Author : Jason Garrett-Glaser
Date: Thu Feb 9 14:23:52 2012 -0800
Add row-reencoding support to VBV for improved accuracy
Extremely accurate, possibly 100% so (I can't get it to fail even with difficult VBVs).
Does not yet support rows split on slice boundaries (occurs often with slice-max-size/mbs).
Still inaccurate with sliced threads, but better than before.
x264r2177
git-id : 5a69f8e105663497794d4bb4e58cf7baa5cd29cb
revision : r2177
Author : Jason Garrett-Glaser
Date: Thu Feb 9 12:38:44 2012 -0800
Abstract bitstream backup/restore functions
Required for row re-encoding.
x264r2176
git-id : 037d123cf62c4af2dc13742b8606882b6d0d3d9e
revision : r2176
Author : Anton Mitrofanov
Date: Thu Feb 9 15:27:53 2012 -0800
Add a small per-MB cost penalty for lowres
Helps avoid VBV predictors going nuts with very low-cost MBs.
One particular case this fixes is zero-cost MBs: adaptive quantization decreases the QP a lot, but (before this patch), no cost penalty gets factored in for this, because anything times zero is zero.
x264r2175
git-id : de5a0adca1a7d08b1233b317ec092dbf19263d2f
revision : r2175
Author : Jason Garrett-Glaser
Date: Mon Feb 13 18:31:51 2012 -0800
Remove explicit run calculation from coeff_level_run
Not necessary with the CAVLC lookup table for zero run codes.
x264r2174
git-id : 9f1ac3b36eb2666e9d2ec4b859f3b63f60827bf0
revision : r2174
Author : Jason Garrett-Glaser
Date: Mon Feb 13 13:20:06 2012 -0800
Export PSNR/SSIM in x264 API
x264r2173
git-id : 7e85ec036df4290697239f5dc9f4a793313ceebc
revision : r2173
Author : Ronald S. Bultje
Date: Wed Feb 8 13:10:31 2012 -0800
x86inc: support yasm -f win64
Not necessary for x264, as -m amd64 already does the right thing, but used by external users of x86inc.
x264r2172
git-id : 02c3d5ec58d6bcbc5e22715ae80d53d8556f3c8f
revision : r2172
Author : Henrik Gramner
Date: Wed Feb 1 23:52:48 2012 +0100
Fix incorrect zero-extension assumptions in x86_64 asm
Some x264 asm assumed that the high 32 bits of registers containing "int" values would be zero.
This is almost always the case, and it seems to work with gcc, but it is *not* guaranteed by the ABI.
As a result, it breaks with some other compilers, like Clang, that take advantage of this in optimizations.
Accordingly, fix all x86 code by using intptr_t instead of int or using movsxd where necessary.
Also add checkasm hack to detect when assembly functions incorrectly assume that 32-bit integers are zero-extended to 64-bit.
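To make the failure mode in r2172 concrete, the sketch below models a 64-bit register as a Python integer. It is only a simulation: the helper names and the example values are hypothetical, and the real fix lives in x86-64 assembly.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def movsxd(reg64):
    """Sign-extend the low 32 bits of a 64-bit register value."""
    low = reg64 & MASK32
    if low & 0x80000000:
        low |= 0xFFFFFFFF00000000
    return low

def address(base, index_reg):
    """Compute base + index with 64-bit wraparound, like an address calculation."""
    return (base + index_reg) & MASK64

# A register that holds the int value 16 in its low half and garbage above it,
# which the ABI permits for 32-bit arguments.
dirty = 0xDEADBEEF00000010

wrong = address(0x1000, dirty)           # uses the garbage upper bits
right = address(0x1000, movsxd(dirty))   # uses only the 32-bit value
```

Using the raw register as an index yields an address far from `base + 16`, while extending the low half first gives the intended result. Sign extension also keeps negative strides correct.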
x264r2171
git-id : 01f7a333e6c6a6d91a7fe977b491a448ddf4c117
revision : r2171
Author : Jason Garrett-Glaser
Date: Thu Feb 23 09:11:23 2012 -0800
Fix possible alignment crash when linking from MSVC
x264_cavlc_init needs to be stack-aligned now.
x264r2170
git-id : b17c247178a24c218843639c3f46bcfde0edab0a
revision : r2170
Author : Anton Mitrofanov
Date: Tue Feb 21 12:58:22 2012 -0800
Fix rare overflow in 10-bit intra_satd_x3_16x16 asm
x264r2169
git-id : 1446fe7c47cf660d764b4cbf53694bc3df9b04de
revision : r2169
Author : Steven Walters
Date: Sat Feb 11 22:56:43 2012 -0500
ICL: fix out of tree building and resource file usage on Windows
x264r2168
git-id : d3efb00abbedd2bbb70156bd989beefe06468116
revision : r2168
Author : Oka Motofumi
Date: Mon Feb 6 06:07:34 2012 +0900
Add error handling for out-of-tree build
x264r2167
git-id : ec41b19edc67ee4eca09c0e3b37e6290844c5e1f
revision : r2167
Author : Anton Mitrofanov
Date: Tue Mar 6 17:34:02 2012 +0400
Fix RGB colorspace input
BGR/BGRA input was correct.
x264r2166
git-id : 39a4c6fecaaa0d6cde8d89d31ef6cd1d25ab802b
revision : r2166
Author : Jason Garrett-Glaser
Date: Mon Feb 13 16:40:32 2012 -0800
Fix interlaced + extremal slice-max-size
Broke if the first macroblock in the slice exceeded the set slice-max-size.
x264r2165
git-id : 3f72c99a15a07511b758d9e94217223480865124
revision : r2165
Author : Henrik Gramner
Date: Sun Feb 5 20:43:09 2012 +0100
Fix regression in r2141
Broke register preservation in x264_cpu_cpuid and x264_cpu_xgetbv.
Did not cause any problems.
x264r2164
git-id : da19765d723b06a1fa189478e9da61a1c18490f8
revision : r2164
Author : Jason Garrett-Glaser
Date: Thu Jan 19 14:56:54 2012 -0800
TBM, AVX2, FMA3, BMI1, and BMI2 CPU detection support
TBM and BMI1 are supported by Trinity/Piledriver.
The others (and BMI1) will probably appear in Intel's upcoming Haswell.
Also update x86inc with AVX2 stuff.
x264r2163
git-id : efef20090a06a38f9d95755588d7830fb92a2a02
revision : r2163
Author : Loren Merritt
Date: Fri Feb 3 06:27:18 2012 +0000
x86inc: add TAIL_CALL macro to abstract a common asm idiom
x264r2162
git-id : a7e6e1793b4d2b49c9449d767320c71daa855cb6
revision : r2162
Author : Jason Garrett-Glaser
Date: Wed Jan 25 16:44:38 2012 -0800
Minor asm optimizations/cleanup
x264r2161
git-id : 56ba096141d16ffcbabd805e2d27014f62f0d722
revision : r2161
Author : Jason Garrett-Glaser
Date: Tue Jan 24 19:03:58 2012 -0800
Clean up and optimize weightp, plus enable SSSE3 weight on SB/BDZ
Also remove unused AVX cruft.
x264r2160
git-id : 961a278e0123eb662b46a6f136a48a43f6a2d427
revision : r2160
Author : Jason Garrett-Glaser
Date: Mon Jan 23 18:57:58 2012 -0800
XOP frame_init_lowres
Covers both 8-bit and 16-bit, ~5-10% faster on Bulldozer.
x264r2159
git-id : c5809994990df6c63b4250546844dc77181fee0f
revision : r2159
Author : Jason Garrett-Glaser
Date: Tue Jan 17 15:25:10 2012 -0800
XOP 8x8 zigzags
Field: 35(mmx) ->16(xop) cycles
Frame: 32(ssse3)->20(xop) cycles
x264r2158
git-id : 14dc11f7c52fa29576e0003c8c16857a78bf5fbf
revision : r2158
Author : Jason Garrett-Glaser
Date: Mon Jan 23 15:09:38 2012 -0800
AVX 32-bit hpel_filter_h
Faster on Sandy Bridge.
Also add details on unsuccessful optimizations in these functions.
x264r2157
git-id : 2fcd0446b5d91ae52e143682c30000a49441e4a1
revision : r2157
Author : Jason Garrett-Glaser
Date: Fri Jan 27 16:29:30 2012 -0800
x86inc: add high halfword register support
Might be useful in a few cases.
x264r2156
git-id : 5c4b8484ea9aaabfb70523ba1f9c4d8343ad3221
revision : r2156
Author : Ronald S. Bultje
Date: Wed Jan 25 13:53:59 2012 +0800
Change %ifdef directives to %if directives in *.asm files
This allows combining multiple conditionals in a single statement.
x264r2155
git-id : 1b558de42dc08a303c2faf79fc9999b48a876370
revision : r2155
Author : Anton Mitrofanov
Date: Sun Jan 22 22:13:52 2012 +0400
Use TV range algorithm for bit-depth conversions
Such sources are more common, so better to be correct for the common case.
This also produces less error for the case of full range than the previous algorithm produced for the case of TV range.
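The changelog for r2155 does not spell out the algorithm. The sketch below shows one common convention as an assumption: limited (TV) range scales by an exact power of two, so an upconversion is a plain shift, while full range maps the maximum code value to the maximum code value. Function names are hypothetical, and x264's actual filter may differ.

```python
def upconvert_tv_range(sample, in_depth, out_depth):
    """Limited-range upconversion: 16..235 at 8-bit maps to 64..940 at 10-bit."""
    return sample << (out_depth - in_depth)

def upconvert_full_range(sample, in_depth, out_depth):
    """Full-range upconversion: scale so that the maximum maps to the maximum."""
    in_max = (1 << in_depth) - 1
    out_max = (1 << out_depth) - 1
    # Rounded integer division.
    return (sample * out_max + in_max // 2) // in_max
```

Applying the shift to full-range data leaves 8-bit white at 1020 instead of 1023 in 10-bit, a small error, which is in line with the trade-off the commit message describes.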
x264r2154
git-id : 83c371deba853a4ebb28739e868df86b3153fb3e
revision : r2154
Author : Hii
Date: Wed Jan 25 16:29:22 2012 +0800
Bump dates to 2012
x264r2153
git-id : a2925c5a707e833c34fa0a64d497c02e6dcfe6e6
revision : r2153
Author : Henrik Gramner
Date: Sat Jan 28 21:38:27 2012 +0100
Add Windows resource file
Displays version info in Windows Explorer.
x264r2152
git-id : 98ade832d053f6bfca4d0dd2ab0cd1c88531721d
revision : r2152
Author : Sergey Radionov
Date: Mon Jan 16 13:22:44 2012 -0800
Fix win32 pthread_cond_signal
Isn't used by x264 currently, so didn't cause a problem.
Fix backported from libav.
x264r2151
git-id : a3f44077dc238dea92c0894d352b5a8723b9201b
revision : r2151
Author : Mans Rullgard
Date: Wed Feb 1 15:55:25 2012 -0800
ARM: align asm functions to 4 bytes.
Some linkers apparently fail to correctly align ARM functions when mixing with Thumb code.
x264r2150
git-id : d3a39c92f5c130cad6d45e9daffa5a2beb145ebb
revision : r2150
Author : Anton Mitrofanov
Date: Sun Jan 22 13:00:23 2012 +0400
Fix normalization of colorspace when input is packed YUV 4:2:2
x264r2149
git-id : 236763e39d8a756db0e8179745396ed88c1bfc2d
revision : r2149
Author : Jason Garrett-Glaser
Date: Sat Jan 21 12:54:40 2012 -0800
Force keyint-min 1 with Blu-ray
Fixes an issue with referencing across I-frames that's prohibited in Blu-ray for some godforsaken reason.
x264r2148
git-id : 8a1189abd1355c4cf6f786fbc2a4b8c22f398710
revision : r2148
Author : Oka Motofumi
Date: Sun Jan 29 20:34:41 2012 +0900
Fix crash in --demuxer y4m with unsupported colorspace
x264r2147
git-id : 1ab0877a40417a2f4f26ff0356e8b02182d9d996
revision : r2147
Author : Anton Mitrofanov
Date: Mon Jan 16 14:02:53 2012 -0800
Fix overread/possible crash with intra refresh + VBV
x264r2146
git-id : bcd41dbcaa4430b2118d9f6828c2b9635cf9d58d
revision : r2146
Author : Loren Merritt
Date: Wed Jan 18 15:47:07 2012 -0800
Fix trellis 2 + subme >= 8
Trellis didn't return a boolean value as it was supposed to.
Regression in r2143-5.
x264r2145
git-id : 748fe16c1303b89d2a1d0378addd83fb4198f51a
revision : r2145
Author : Loren Merritt
Date: Fri Jan 6 15:53:29 2012 +0000
CABAC trellis opts part 4: x86_64 asm
Another 20% faster.
18k->12k codesize.
This patch series may have a large impact on encoding speed.
For example, 24% faster at --preset slower --crf 23 with 720p parkjoy.
Overall speed increase is proportional to the cost of trellis (which is proportional to bitrate, and much more with --trellis 2).
x264r2144
git-id : cfdb36ece729209631f7213506685ae733d7f5d4
revision : r2144
Author : Loren Merritt
Date: Fri Jan 6 15:53:04 2012 +0000
CABAC trellis opts part 3: make some arrays non-static
x264r2143
git-id : 65bd12ae875a768a06b67ec6297dec18323e0768
revision : r2143
Author : Loren Merritt
Date: Thu Dec 22 17:56:06 2011 +0000
CABAC trellis opts part 2: C optimizations
Hoist the branch on coef value out of the loop over node contexts.
Special cases for each possible coef value (0,1,n).
Special case for dc-only blocks.
Template the main loop for two common subsets of nodes, to avoid a bunch of branches about which nodes are live.
Use the nonupdating version of cabac_size_decision in more cases, and omit those bins from the node struct.
CABAC offsets are now compile-time constants.
Change TRELLIS_SCORE_MAX from a specific constant to anything negative, which is cheaper to test.
Remove dct_weight2_zigzag[], since trellis has to lookup zigzag[] anyway.
60% faster on x86_64.
25k->18k codesize.
x264r2142
git-id : e176619d010fc32c970c7ab7a769bbfbe2665f61
revision : r2142
Author : Loren Merritt
Date: Thu Dec 22 17:55:06 2011 +0000
CABAC trellis opts part 1: minor change in output
Due to different tie-break order.
x264r2141
git-id : 4e87f36a0e1a78242f04db611e06f80b6b38d900
revision : r2141
Author : Henrik Gramner
Date: Sun Jan 8 04:14:10 2012 +0100
x86inc improvements for 64-bit
Add support for all x86-64 registers
Prefer caller-saved register over callee-saved on WIN64
Support up to 15 function arguments
x264r2140
git-id : 84a06e611aff1267a720bf9552b3bcf263bd83b5
revision : r2140
Author : Ilia Valiakhmetov
Date: Sun Jan 15 04:47:58 2012 -0600
High bit depth SSE2/AVX add8x8_idct8 and add16x16_idct8
From Google Code-In.
x264r2139
git-id : c605e3174410ba5c7d1d0a777082e2397734d637
revision : r2139
Author : Edward Wang
Date: Wed Jan 4 15:35:54 2012 -0800
MMX/SSE2/AVX predict_8x16_p, high bit depth fdct8
From Google Code-In.
x264r2138
git-id : 6b06f6d3f7f800dca1a4ea154f54427d5b3cea2b
revision : r2138
Author : Jason Garrett-Glaser
Date: Thu Dec 22 14:03:15 2011 -0800
XOP 8-bit fDCT
Use integer MAC for one of the SUMSUB passes. About a dozen cycles faster for 16x16.
x264r2137
git-id : c4b54c83629bb92af6c4836a8859e9432dc7333a
revision : r2137
Author : Cristian Militaru
Date: Wed Jan 4 12:38:08 2012 -0800
High bit depth intra_sad_x3_4x4
From Google Code-In.
x264r2136
git-id : c032fbaa3801fb4cf8dd1dd95a6479ca5bd262e2
revision : r2136
Author : Jason Garrett-Glaser
Date: Thu Dec 8 13:45:41 2011 -0800
Use a large LUT for CAVLC zero-run bit codes
Helps the most with trellis and RD, but also helps with bitstream writing.
Seems at worst neutral even in the extreme case of a CPU with small L2 cache (e.g. ARM Cortex A8).
x264r2135
git-id : ebb1429e2d24f57aa4ea75284386a15f2eab553e
revision : r2135
Author : Matt Habel
Date: Fri Dec 16 23:16:09 2011 -0800
High bit depth intra_sad_x3_8x8, intra_satd_x3_4x4/8x8c/16x16
Also add an ACCUM macro to handle accumulator-induced add-or-swap more concisely.
x264r2134
git-id : 6d921c5bdefae1a733a3a4c29d88ea15fcece76e
revision : r2134
Author : Shitiz Garg
Date: Sat Dec 3 15:34:57 2011 -0800
MMX 10-bit predict_8x8c_h and predict_8x16c_h
From Google Code-In.
x264r2133
git-id : 47cdaa9c3d8197d4deb711d9bcc4af869ef8a426
revision : r2133
Author : Aaron Schmitz
Date: Wed Nov 30 00:15:45 2011 -0600
Some MBAFF x86 assembly functions.
deblock_chroma_420_mbaff, plus 422/422_intra_mbaff implemented using existing functions.
From Google Code-In.
x264r2132
git-id : 027b05e0a22421e477847506a205a49b151ae5bf
revision : r2132
Author : George Stephanos
Date: Thu Dec 1 16:53:45 2011 -0800
More ARM NEON assembly functions
predict_8x8_v, predict_4x4_dc_top, predict_8x8_ddl, predict_8x8_ddr, predict_8x8_vl, predict_8x8_vr, predict_8x8_hd, predict_8x8_hu.
From Google Code-In.
x264r2131
git-id : 658a3585b74f77fd8f78588f3f39e0abefb104c4
revision : r2131
Author : Ilia
Date: Mon Nov 28 05:20:09 2011 -0800
More 4:2:2 asm functions
High bit depth version of deblock_h_chroma_422.
Regular and high bit depth versions of deblock_h_chroma_intra_422.
High bit depth pixel_vsad.
SSE2 high bit depth and MMX 8-bit predict_8x8_vl.
Our first GCI patch this year!
x264r2130
git-id : 978abe065737089913feccffece483bc69a9e5b0
revision : r2130
Author : Henrik Gramner
Date: Thu Dec 8 16:14:35 2011 +0100
SSE2 and SSSE3 versions of sub8x16_dct_dc
Also slightly faster sub8x8_dct_dc
x264r2129
git-id : 61a78a1b417595c4b5d7ef6831692904a243a9fc
revision : r2129
Author : Steven Walters
Date: Mon Dec 5 08:46:34 2011 -0500
Resize filter updates
Use AVPixFmtDescriptors to pick the most compatible x264 csp for any pixel format.
Fix deprecated use of av_set_int.
Now requires libavutil >= 51.19.0
x264r2128
git-id : bc6c98cf4f76c779c8c07f43aa97ac29b1150bc0
revision : r2128
Author : Oka Motofumi
Date: Thu Jan 5 14:23:50 2012 -0800
Add out-of-tree build support
x264r2127
git-id : f33c8cb0f8fff7a83100b3e9d15baba53c6f6a35
revision : r2127
Author : Anton Mitrofanov
Date: Fri Dec 16 18:17:00 2011 +0400
Limit SSIM to 100 dB
Avoids floating point error for infinite SSIM (lossless).
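A minimal sketch of the cap in r2127, assuming SSIM is converted to decibels as -10*log10(1 - SSIM). That mapping and the function name are assumptions here. With a lossless encode SSIM is 1, the logarithm's argument is 0, and the result would be infinite without a limit.

```python
import math

def ssim_db(ssim, cap=100.0):
    """Convert an SSIM value in [0, 1] to dB, limited to `cap`."""
    inv = 1.0 - ssim
    if inv <= 0.0:
        # Lossless case: log10(0) is undefined, so return the cap directly.
        return cap
    return min(cap, -10.0 * math.log10(inv))
```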
x264r2126
git-id : c0d698859c36be611d465f968762f042853be817
revision : r2126
Author : Reynaldo H. Verdejo Pinochet
Date: Wed Jan 4 13:16:12 2012 -0300
Fix wrong conditional inclusion of inttypes.h
inttypes.h is required by encoder/ratecontrol.c for SCNxxx macros, and HAVE_STDINT_H does not imply having inttypes.h.
stdint.h is a subset of inttypes.h, but this isn't enough for x264.
This change fixes building x264 with Android's toolchain.
x264r2125
git-id : b081d179e741ceffee2217f6fda06779693dce56
revision : r2125
Author : Anton Mitrofanov
Date: Wed Dec 21 11:08:56 2011 +0400
Fix crash with sliced threads and input height <= 112
x264r2124
git-id : 64da5f9df46ac33a5a6b56ca1510d2082e6fbb62
revision : r2124
Author : Phillip Blucas
Date: Mon Dec 19 17:43:41 2011 -0600
Fix loading custom 8x8 chroma quant matrices in 4:4:4
x264r2123
git-id : 4c08e42504af81cdbe5789a309e868ca8eda2c1f
revision : r2123
Author : Anton Mitrofanov
Date: Fri Dec 16 01:48:07 2011 +0400
Fix PCM cost overflow
x264r2122
git-id : 489a9b2d04c4828877930d2a9104ce93dde8cb85
revision : r2122
Author : Anton Mitrofanov
Date: Fri Dec 9 01:54:22 2011 +0400
Fix overflow in 8-bit x86 vsad asm function
x264r2121
git-id : c291a9d09263708e9d9f02e28f8442fdbe46bb06
revision : r2121
Author : Anton Mitrofanov
Date: Wed Dec 7 19:14:52 2011 +0400
Fix crash in --fullhelp when compiled against recent ffmpeg
Don't assume all pixel formats have a description.
x264r2120
git-id : 0c7dab9c2a106ce3ee5d6ad7282afb49e1cc3954
revision : r2120
Author : Jason Garrett-Glaser
Date: Tue Dec 6 14:39:21 2011 -0800
Fix regression in r2118
Broke trellis with i16x16 macroblocks.
x264r2119
git-id : 0637cd67cb245fce5ba190fa4b9c341319ea2b37
revision : r2119
Author : Jason Garrett-Glaser
Date: Wed Nov 30 13:02:12 2011 -0800
Modify MBAFF chroma deblock functions to handle U/V at the same time
Allows for more convenient asm implementations.
x264r2118
git-id : 67f1fdc4d9c030568eac8cf9ab9d0bb249f520db
revision : r2118
Author : Jason Garrett-Glaser
Date: Thu Nov 10 16:16:13 2011 -0800
CABAC trellis optimizations: use SIMD quant
Significant speed increase, minor change in output due to rounding.
x264r2117
git-id : e047b3c475cd42b6647397a244e239ebfca53bf6
revision : r2117
Author : Steven Walters
Date: Sun Nov 6 09:48:30 2011 -0800
YUV range detection and support for x264CLI
Two new options: --input-range and --range.
--input-range forces the range of the input in case of misdetection; auto by default.
--range sets the range of the output; x264cli will convert if necessary, TV by default.
--fullrange is now removed as a CLI option (but the libx264 API is unchanged).
x264r2116
git-id : 00df989cc06208050230756525633438d76b5a6a
revision : r2116
Author : Kieran Kunhya
Date: Fri Nov 4 20:09:13 2011 +0000
Pass through user data
x264r2115
git-id : 04a0aeefd2f5b152c5dbca4a1c6569bd27c9f721
revision : r2115
Author : Jason Garrett-Glaser
Date: Thu Oct 27 14:05:56 2011 -0700
Remove unpredictable branch in CABAC dqp
x264r2114
git-id : 4185ee883b04d9cee57a64fdebd153830b7b27ba
revision : r2114
Author : Loren Merritt
Date: Sun Oct 23 23:15:11 2011 +0000
x86inc: AVX symmetry optimization
3-arg AVX ops with a memory arg can only have it in src2,
whereas SSE emulation of 3-arg prefers to have it in src1 (i.e. the move).
So, if the op is symmetric and the wrong one is memory, swap them.
Eliminates redundant moves in some cases when using 3-operand without AVX with memory arguments.
Also fix movss and movsd in some cases, and flag shufps correctly as float.
x264r2113
git-id : cc129adcaaf5604f3d4fea9ebcb289403192a741
revision : r2113
Author : Anton Mitrofanov
Date: Tue Nov 29 13:45:13 2011 -0800
checkasm: shut up gcc warnings, fix some naming of functions in results
x264r2112
git-id : f0ccc98bb747b8ee0fe9329f4205cf382788bb89
revision : r2112
Author : Mans Rullgard
Date: Mon Nov 28 16:29:12 2011 -0800
checkasm: fix build on ARM
Because of how ALIGNED_ARRAY_16 is defined on ARM, array initialisers cannot be used here. Use memset() instead.
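A minimal sketch of why an initialiser cannot be used, assuming the alignment macro is emulated with an over-sized byte buffer and a pointer rounded up to a 16-byte boundary. The macro below is a hypothetical stand-in, not x264's actual ALIGNED_ARRAY_16 definition.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical emulation: declares a raw buffer with 15 bytes of slack and
 * a pointer `name` rounded up to the next 16-byte boundary. Because `name`
 * is a pointer rather than an array, `= {0}` cannot initialise the data. */
#define EMU_ALIGNED_ARRAY_16(name, count) \
    uint8_t name##_raw[(count) + 15]; \
    uint8_t *name = (uint8_t *)(((uintptr_t)name##_raw + 15) & ~(uintptr_t)15)

/* Returns 1 if the emulated array is 16-byte aligned and fully zeroed. */
static int zeroed_aligned_buffer_ok(void)
{
    EMU_ALIGNED_ARRAY_16(pixels, 64);

    /* Clear the buffer explicitly instead of using an initialiser. */
    memset(pixels, 0, 64);

    if (((uintptr_t)pixels & 15) != 0)
        return 0;
    for (int i = 0; i < 64; i++)
        if (pixels[i] != 0)
            return 0;
    return 1;
}
```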
x264r2111
git-id : d8d8e756b1fee72b4771761d6aa4cfb31edc0b67
revision : r2111
Author : Anton Mitrofanov
Date: Sat Nov 12 01:31:49 2011 +0400
Improve makefile rules
Remove the need for "make clean" after most reconfigures.
x264r2110
git-id : e6d33a931c08918e78dcae97e4d80d0c3411bf2c
revision : r2110
Author : Anton Mitrofanov
Date: Sat Nov 12 00:47:48 2011 +0400
Mark some local functions as static, cosmetics
x264r2109
git-id : e0c11dc6e283569606aaa97767401c6a13c2529d
revision : r2109
Author : Anton Mitrofanov
Date: Fri Nov 11 23:19:02 2011 +0400
Fix crash if timecode file opening fails
x264r2108
git-id : a14db080c3fdba4cadc38152a292bb1fa216d50e
revision : r2108
Author : Fabian Greffrath
Date: Fri Nov 11 13:25:43 2011 -0800
Configure: force PIC for shared build on PARISC and MIPS
x264r2107
git-id : 6a0bd421bf5fd006012ddcd1be2072a8736b2d27
revision : r2107
Author : Anton Mitrofanov
Date: Sat Oct 22 19:41:07 2011 +0400
Improve yasm version check
Previous check allowed certain earlier versions that weren't fully compatible.
x264r2106
git-id : 07efeb45db224b7757880d4d63bb549fb454f6db
revision : r2106
Author : Jason Garrett-Glaser
Date: Tue Oct 18 14:30:26 2011 -0700
Add fenc prefetching to adaptive quant
Many fewer cache misses, faster adaptive quant.
x264r2105
git-id : 81a99842b76834c11a46438f354d7f2a9e89752a
revision : r2105
Author : Jason Garrett-Glaser
Date: Tue Oct 18 14:14:03 2011 -0700
Split prefetch_fenc between colorspaces
Add 4:2:2 version.
x264r2104
git-id : 9f872e137c16e8ee0a46d8ca00ac5d670c219d5f
revision : r2104
Author : Jason Garrett-Glaser
Date: Tue Oct 11 17:04:32 2011 -0700
Some more 4:2:2 x86 asm
coeff_last8, coeff_level_run8, var2_8x16, predict_8x16c_dc, satd_4x16, intra_mbcmp_8x16c_x3, deblock_h_chroma_422
x264r2103
git-id : f52aa86c184d69b4e97b0f63f5f27166b19aa280
revision : r2103
Author : Loren Merritt
Date: Tue Oct 11 18:12:43 2011 +0000
Remove obsolete versions of intra_mbcmp_x3
intra_mbcmp_x3 is unnecessary if x9 exists (SSSE3 and onwards).
x264r2102
git-id : 2f0384dcd68bb85f98fb566b70b863b40082c83e
revision : r2102
Author : Loren Merritt
Date: Mon Oct 10 05:42:36 2011 +0000
SSSE3/SSE4/AVX 9-way fully merged i8x8 analysis (sa8d_x9)
x86_64 only for now, due to register requirements (like sa8d_x3).
i8x8 analysis cycles (per partition):
preset    penryn    sandybridge  bulldozer
faster    616->600  482->374     418->356
medium    892->632  725->387     598->373
slower    948->650  789->409     673->383
x264r2101
git-id : 46d1f3ab24e8aead7ccb3f89a382e7c92721ba96
revision : r2101
Author : Jason Garrett-Glaser
Date: Fri Sep 30 19:09:19 2011 -0700
SSSE3/SSE4/AVX 9-way fully merged i8x8 analysis (sad_x9)
~3 times faster than current analysis, plus (like intra_sad_x9_4x4) analyzes all modes without shortcuts.
x264r2100
git-id : 077d4532c9d9c7914e31ef9250096cc379042bcb
revision : r2100
Author : Loren Merritt
Date: Wed Oct 5 13:29:21 2011 -0700
Merge i4x4 prediction with intra_mbcmp_x9_4x4
Avoids a redundant prediction after analysis.
x264r2099
git-id : 55a9d38348a1d0bee687293e194b018b21a6ad96
revision : r2099
Author : Jason Garrett-Glaser
Date: Wed Oct 5 13:17:31 2011 -0700
Inline i4x4/i8x8 encode into intra analysis
Larger code size, but faster.
x264r2098
git-id : afd9bc24823b0f9f0727c0332a0db24db66876d2
revision : r2098
Author : Jason Garrett-Glaser
Date: Wed Sep 21 17:12:10 2011 -0700
Initial XOP and FMA4 support on AMD Bulldozer
~10% faster Hadamard functions (SATD/SA8D/hadamard_ac) plus other improvements.
x264r2097
git-id : 8cf50493e5b80d9e33aaf0c9d55551d6411e1be4
revision : r2097
Author : Mans Rullgard
Date: Tue Sep 27 21:14:14 2011 +0400
ARM: update NEON chroma deblock functions to NV12 pixel format
x264r2096
git-id : f7e640a33fe66838ecece1da267b566342f3be24
revision : r2096
Author : Sean McGovern
Date: Mon Oct 17 12:45:15 2011 -0700
Add /usr/lib/{64/}values-xpg6.o to $LDFLAGS on Solaris
This is required for POSIX.1-2001 compliance.
x264r2095
git-id : a0ce295b33a1ba87f732e661e22dba1a307e3405
revision : r2095
Author : Sean McGovern
Date: Mon Oct 17 12:44:03 2011 -0700
Fix linker test for -Bsymbolic
The Solaris linker only accepts -Bsymbolic for objects compiled in dynamic mode (i.e. shared objects), so pass -shared to gcc.
Additionally, for x86_32 unresolved textrels cause a linker error so mark the .text section as 'impure'.
x264r2094
git-id : 9aa0f65f72514cfa8c478fbffdafd937c70b5f5d
revision : r2094
Author : Sean McGovern
Date: Mon Oct 17 12:43:28 2011 -0700
Add $SOFLAGS to exported SOFLAGS make variable
x264r2093
git-id : d32d091d519c5ff710b2fb7b2f255fd510e4a6d8
revision : r2093
Author : Henrik Gramner
Date: Sat Sep 24 15:56:08 2011 +0200
Allow setting a chroma format at compile time
Gives a slight speed increase and significant binary size reduction when only one chroma format is needed.
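One plausible way such a compile-time setting can pay off is sketched below: when the format is fixed at build time, format checks become constant expressions and the compiler can drop the unused branches. The macro and function names are hypothetical and are not taken from the x264 source.

```c
/* Chroma format identifiers; the numeric values are illustrative. */
enum { FMT_420 = 1, FMT_422 = 2, FMT_444 = 3 };

/* Hypothetical build-time switch: 0 means "decide at run time",
 * e.g. compile with -DFIXED_CHROMA_FORMAT=1 for a 4:2:0-only build. */
#ifndef FIXED_CHROMA_FORMAT
#define FIXED_CHROMA_FORMAT 0
#endif

#if FIXED_CHROMA_FORMAT
/* Constant: branches on the format can be resolved by the compiler. */
#define ACTIVE_FORMAT(runtime_fmt) FIXED_CHROMA_FORMAT
#else
/* Generic build: the format comes from the stream parameters. */
#define ACTIVE_FORMAT(runtime_fmt) (runtime_fmt)
#endif

/* Chroma plane height: halved only for 4:2:0. */
static int chroma_height(int luma_height, int runtime_fmt)
{
    int fmt = ACTIVE_FORMAT(runtime_fmt);
    return fmt == FMT_420 ? luma_height / 2 : luma_height;
}

/* Chroma plane width: halved for 4:2:0 and 4:2:2, full for 4:4:4. */
static int chroma_width(int luma_width, int runtime_fmt)
{
    int fmt = ACTIVE_FORMAT(runtime_fmt);
    return fmt == FMT_444 ? luma_width : luma_width / 2;
}
```

In the default (run-time) configuration, chroma_height(16, FMT_422) is 16 and chroma_width(16, FMT_422) is 8.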
x264r2092
git-id : 6eac7c35a5da6c176cedc2644c53ff9d019f7fb0
revision : r2092
Author : Harfe Leier
Date: Fri Sep 30 12:49:33 2011 -0700
Improve profile help
List high422/high444 profiles, and don't show non-high-bit-depth profiles in high bit depth builds.
x264r2091
git-id : 896fff46dd9a0fba9bb5285d536d03e0d5f86da2
revision : r2091
Author : Yusuke Nakamura
Date: Thu Oct 20 03:09:51 2011 +0900
Fix infinite loop parsing TDecimate Mode 3 timecode v1 files
x264r2090
git-id : 2697313a6f223f0b270ba9533c6b47967fa7d246
revision : r2090
Author : Jason Garrett-Glaser
Date: Mon Oct 10 17:44:31 2011 -0700
Fix some integer overflows/signedness errors found by IOC
The only real bug here is in slicetype.c, which may or may not affect real encodes.
x264r2089
git-id : d2594831dd858d6ed8efcfd4160ea5ac7f1357c7
revision : r2089
Author : Jason Garrett-Glaser
Date: Wed Oct 12 09:16:32 2011 -0700
Fix pixel_var2 with 4:2:2 encoding
Might have caused artifacts or suboptimal chroma compression.
x264r2088
git-id : 1fe87df5857266f0099a473962e7f32a89d9b701
revision : r2088
Author : Anton Mitrofanov
Date: Sun Oct 9 19:14:16 2011 +0400
Fix chroma intra analysis in 4:4:4 lossless mode
x264r2087
git-id : c4644d878dc82f8812482f660f651948d53d4b43
revision : r2087
Author : Anton Mitrofanov
Date: Sun Oct 9 01:13:29 2011 +0400
Fix use of uninitialized MVs in sub8x8 RDO
x264r2086
git-id : f8825a4a6f827bb28fffb75a7cc1a6c386088828
revision : r2086
Author : Fabian Greffrath
Date: Fri Oct 7 19:04:17 2011 -0700
Fix detection of Alpha CPU arch on alphaev67
x264r2085
git-id : 8a62835b0b669e79c75b6522b1f7632fe16105d9
revision : r2085
Author : Jason Garrett-Glaser
Date: Wed Sep 14 14:53:04 2011 -0700
Optimize x86 asm for Intel macro-op fusion
That is, place all loop counter tests right before their conditional jumps.
x264r2084
git-id : f7cd45b9bbc1f7f5bfd2df6421e79895655552ab
revision : r2084
Author : Jason Garrett-Glaser
Date: Mon Sep 12 11:51:23 2011 -0700
CAVLC: clean up and restructure
Somewhat faster CAVLC and RD bit-counting.
x264r2083
git-id : 4a89c200a2c50f17bcf657f3254f6f05b2c0df41
revision : r2083
Author : Jason Garrett-Glaser
Date: Thu Sep 8 17:27:02 2011 -0700
CABAC: clean up and restructure
Somewhat faster CABAC and RD bit-counting.
x264r2082
git-id : 62fc472989765a6bea4485c8988d7b246e7ceeb5
revision : r2082
Author : Jason Garrett-Glaser
Date: Sun Sep 4 11:31:29 2011 +0200
Some initial 4:2:2 x86 asm
x264r2081
git-id : bb9216dc319a39eb6f2a5508a98e36d6492ffa7e
revision : r2081
Author : Henrik Gramner